THE GENERAL CANONICAL CORRELATION DISTRIBUTION

By M. S. Bartlett

University of Cambridge, England and University of North Carolina

1. Summary. The general canonical correlation distribution is given as a multiple power series in the true canonical correlations $\rho_i$. When only one true correlation is not zero, this series is a generalized hypergeometric function, for the two cases both of non-central means, and of correlations proper. In the general case of more than one non-zero true correlation the coefficients in the expansion are the conditional moments of the sample correlations between the pairs of transformed variables, representing the true canonical variables, when the sample canonical correlations between the sample canonical variables are fixed. Methods are given of obtaining these coefficients for both cases, non-central means and correlations proper; and their form up to the fourth order, corresponding to $O(\rho^8)$ in the expansion, listed in Appendix I. The detailed terms making up these coefficients are given, in the case of two non-zero correlations, up to the fourth order, and in the general case, up to the third order, in Appendix II.

2. Introductory remarks; the case of zero roots. In the statistical theory of the relation of one vector variate with another (see Hotelling [1]), the simultaneous distribution of the canonical correlations $r_i$, which are the roots of a certain determinantal equation, was first obtained (Fisher [2], Hsu [3], Roy [4]) in the special but important case when the true roots or correlations $\rho_i$ are zero. Roy [5] has since investigated the case where the true roots are not zero when these non-zero values arise from non-central means. The present investigation is primarily intended to cover the alternative case where non-zero roots arise from the existence of true correlations $\rho_i$. The method developed is, however, also applicable to the case of non-central means; and it is shown that the general distribution, which for more than one non-zero root becomes very complicated, does not in the case of non-central means agree with the distribution given by Roy [5] except in the case of only one non-zero root.¹

¹ This conclusion has also been reached by T. W. Anderson, who has given a solution of the non-central means problem in the case of either one or two non-zero roots (Annals of Math. Stat., Vol. 17 (1946), pp. 409-431).

² This classification of a variate as the "dependent variate" or "independent variate" is in the regression sense, and does not necessarily imply statistical dependence or independence.

It will be convenient in this introductory section to sketch (with slight modifications) the method used by Hsu [3] to obtain the solution in the case of zero roots, as some of his intermediate formulae are useful for the present developments. We consider a dependent vector variate with $p$ components, and an "independent"² vector variate with $q$ components. For definiteness we assume $p \le q$, and the sample with $n$ degrees of freedom. The sum of squares and products of the dependent variate is divided in the usual way into two parts, a part with $q$ degrees of freedom corresponding to the independent variate and a remaining part with $n - q$ degrees of freedom. If $A$, $B$ are the matrices of sums of squares and products corresponding to this division, then it is known that the joint distribution of $a_{ij}$ and $b_{ij}$, if the dependent vector variate is actually, in the statistical sense, independent of the second, is


$$p(a_{ij}, b_{ij}) = \frac{|A|^{\frac12(q-p-1)}\, |B|^{\frac12(n-q-p-1)} \exp\{-\tfrac12 \sum_i (a_{ii} + b_{ii})\}\, dA\, dB}{2^{\frac12 np}\, \pi^{\frac12 p(p-1)} \prod_{i=1}^{p} \Gamma\{\tfrac12(q-i+1)\}\, \Gamma\{\tfrac12(n-q-i+1)\}} \tag{1}$$

where $|A|$ denotes the determinant of the matrix $A$, $dA$ the product of differentials $da_{ij}$, and where for convenience the variance matrix of the dependent variate is taken to be the unit matrix.

We make the transformation specified by

$$A = W D W', \qquad A + B = W W', \tag{2}$$

where $D$ is a diagonal matrix of the quantities $r_i^2$ in descending order of magnitude, and $W = (w_{ij})$ is a matrix (with transpose $W'$) uniquely determined by (2) except for an ambiguity of sign for each column; this ambiguity can be removed by choosing positive elements in the first row. The Jacobian $\Delta$ of the transformation may be shown to be

$$\Delta = 2^p\, |W|^{p+2} \prod_{i<j} (r_i^2 - r_j^2). \tag{3}$$
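The transformation means that the squared sample canonical correlations are the roots $t = r^2$ of $|A - t(A + B)| = 0$. As a minimal sketch, assuming $p = 2$ so that the determinantal equation is a quadratic (the function name and the example matrices are illustrative and not taken from the paper), the roots can be computed directly:

```python
import math


def squared_canonical_correlations_2x2(A, B):
    """Return the roots t = r^2 of det(A - t (A + B)) = 0, largest first.

    A and B are symmetric 2x2 matrices given as nested lists; A + B is
    assumed positive definite so that the quadratic has real roots.
    """
    a11, a12, a22 = A[0][0], A[0][1], A[1][1]
    c11 = a11 + B[0][0]
    c12 = a12 + B[0][1]
    c22 = a22 + B[1][1]
    # det(A - tC) = det(C) t^2 - (a11 c22 + a22 c11 - 2 a12 c12) t + det(A)
    qa = c11 * c22 - c12 ** 2
    qb = -(a11 * c22 + a22 * c11 - 2.0 * a12 * c12)
    qc = a11 * a22 - a12 ** 2
    disc = math.sqrt(max(qb * qb - 4.0 * qa * qc, 0.0))
    return [(-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)]


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical regression part
    B = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical residual part
    print(squared_canonical_correlations_2x2(A, B))  # [0.75, 0.5]
```

For larger $p$ the same roots would normally be obtained from a generalized symmetric eigenvalue routine rather than from an explicit polynomial.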


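By direct substitution, we obtain from (1) the distribution

$$p(a_{ij}, b_{ij}) = p(w_{ij}, r_i^2) = p(w_{ij})\, p(r_i^2),$$

where $p(x_i)$ is a general notation³ for a distribution function in one or more variates $x_i$ (including the differential elements); for $p(w_{ij})$ and $p(r_i^2)$ we have

$$p(w_{ij}) = C_1\, |W|^{n-p} \exp\{-\tfrac12 \sum_{i,j} w_{ij}^2\} \prod_{i,j} dw_{ij}, \tag{4}$$

$$p(r_i^2) = C_2 \prod_{i=1}^{p} \{(r_i^2)^{\frac12(q-p-1)} (1 - r_i^2)^{\frac12(n-q-p-1)}\, dr_i^2\} \prod_{i<j} (r_i^2 - r_j^2), \tag{5}$$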






³ The probability symbol is not of course to be confused with the number $p$ of components in the dependent variate. It should also be noted that for convenience $p(x_i)$ is used to denote the joint probability for a set of quantities $x_i$, whereas $p(x_1)$ or $p(x_2)$ denote the probability for the specified variate $x_1$ or $x_2$ considered separately.





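the constants $C_1$ and $C_2$ being arranged to give unity on integration of $p(w_{ij})$ or $p(r_i^2)$, i.e. we have

$$C_1 = \frac{\prod_{i=1}^{p} \Gamma\{\tfrac12(p-i+1)\}}{2^{\frac12 p(n-2)}\, \pi^{\frac12 p^2} \prod_{i=1}^{p} \Gamma\{\tfrac12(n-i+1)\}} \tag{6}$$

(the $w_{ij}$ varying from $-\infty$ to $\infty$ except that $w_{1j} \ge 0$), and

$$C_2 = \pi^{\frac12 p} \prod_{i=1}^{p} \frac{\Gamma\{\tfrac12(n-i+1)\}}{\Gamma\{\tfrac12(q-i+1)\}\, \Gamma\{\tfrac12(n-q-i+1)\}\, \Gamma\{\tfrac12(p-i+1)\}}. \tag{7}$$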


3. Formal determination of the general distribution. The method to be adopted of obtaining the general distribution from the particular case quoted in equation (5) above is the same in principle as the one adopted by Fisher [7] in his derivation of the general distribution of the multiple correlation coefficient. Since the argument is more involved in the present problem, it will be presented first in formal probability terms, before the details of the solution are examined.

We consider a transformation of the components of each vector variate to the true canonical components. Let the observed ordinary correlation coefficients of these mutually independent components for one vector variate with the corresponding components of the second vector variate be denoted by $s_i$. The true correlations are the true canonical correlations $\rho_i$. Then we have for the general canonical correlation distribution denoted by $p(r_i \mid \rho_i)$,⁴ the expression


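$$\begin{aligned} p(r_i \mid \rho_i) &= \int p(r_i, s_i \mid \rho_i) \\ &= \int p(r_i \mid s_i, \rho_i)\, p(s_i \mid \rho_i) \\ &= \int p(r_i \mid s_i)\, p(s_1 \mid \rho_1)\, p(s_2 \mid \rho_2) \cdots p(s_p \mid \rho_p), \end{aligned}$$

the substitution $p(r_i \mid s_i)$ for $p(r_i \mid s_i, \rho_i)$ following from the sufficiency of the independent correlations $s_i$ of the corresponding pairs of canonical components, as statistics for the $\rho_i$. We now define the function $g(s_i, \rho_i)$ by the relation

$$p(s_i \mid \rho_i) = p(s_i \mid \rho_i = 0)\, g(s_i, \rho_i),$$

whence we have the general solution

$$\begin{aligned} p(r_i \mid \rho_i) &= \int p(r_i \mid s_i)\, p(s_i \mid \rho_i = 0)\, g(s_1, \rho_1)\, g(s_2, \rho_2) \cdots g(s_p, \rho_p) \\ &= \int p(r_i, s_i \mid \rho_i = 0)\, g(s_1, \rho_1)\, g(s_2, \rho_2) \cdots g(s_p, \rho_p) \\ &= p(r_i \mid \rho_i = 0) \int p(s_i \mid r_i, \rho_i = 0)\, g(s_1, \rho_1)\, g(s_2, \rho_2) \cdots g(s_p, \rho_p) \end{aligned} \tag{8}$$

for $p(r_i \mid \rho_i)$ in terms of the special case $p(r_i \mid \rho_i = 0)$.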


⁴ Quantities to the right of the vertical stroke in a bracket are given quantities on which the probability distribution depends.
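Now according as the independent vector variate with which the dependent variate is correlated is (a) free or (b) fixed in the sample space (this includes the non-central means problem), we have for the distribution of the multiple correlation $R$ of a single dependent variate with an independent variate comprising $m$ components,

$$\begin{aligned} \text{(a)}\quad g(R, \rho) &= F(\tfrac12 n, \tfrac12 n; \tfrac12 m; \rho^2 R^2)\,(1 - \rho^2)^{\frac12 n}, \\ \text{(b)}\quad g(R, \rho) &= F(\tfrac12 n; \tfrac12 m; \tfrac12 \tau^2 R^2)\, e^{-\frac12 \tau^2}, \end{aligned} \tag{9}$$

where we replace $\rho^2$ by a parameter $\tau^2$ in case (b), and the notation for the hypergeometric functions used is:

$$F(\alpha; \beta; x) = 1 + \frac{\alpha}{\beta} x + \frac{\alpha(\alpha+1)}{\beta(\beta+1)} \frac{x^2}{2!} + \cdots,$$

$$F(\alpha_1, \alpha_2; \beta; x) = 1 + \frac{\alpha_1 \alpha_2}{\beta} x + \frac{\alpha_1(\alpha_1+1)\,\alpha_2(\alpha_2+1)}{\beta(\beta+1)} \frac{x^2}{2!} + \cdots.$$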
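The two hypergeometric series can be summed term by term. The following is a minimal sketch, assuming plain truncation after a fixed number of terms and $|x| < 1$ for the two-numerator series (the function names are illustrative, not the paper's):

```python
def hyp_one_numerator(a, b, x, terms=200):
    """Truncated series 1 + (a/b) x + a(a+1)/(b(b+1)) x^2/2! + ..."""
    total, term = 1.0, 1.0
    for k in range(terms):
        # ratio of consecutive terms: (a + k) x / ((b + k)(k + 1))
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
    return total


def hyp_two_numerators(a1, a2, b, x, terms=200):
    """Truncated series 1 + (a1 a2 / b) x + ...; converges for |x| < 1."""
    total, term = 1.0, 1.0
    for k in range(terms):
        # ratio of consecutive terms: (a1 + k)(a2 + k) x / ((b + k)(k + 1))
        term *= (a1 + k) * (a2 + k) * x / ((b + k) * (k + 1))
        total += term
    return total


if __name__ == "__main__":
    # Two classical special cases serve as sanity checks.
    print(hyp_two_numerators(1.0, 1.0, 1.0, 0.5))  # geometric series, 1/(1 - x)
    print(hyp_one_numerator(2.0, 2.0, 1.5))        # exponential series, e^x
```

When the argument approaches 1 the truncated sum converges slowly, so a larger `terms` value or a library routine would be preferable in that regime.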

It follows that we may write $g(s_1, \rho_1)$ above in the form

$$\begin{aligned} \text{(a)}\quad g(s_1, \rho_1) &= F(\tfrac12 n, \tfrac12 n; \tfrac12; \rho_1^2 s_1^2)\,(1 - \rho_1^2)^{\frac12 n}, \\ \text{(b)}\quad g(s_1, \rho_1) &= F(\tfrac12 n; \tfrac12; \tfrac12 \tau_1^2 s_1^2)\, e^{-\frac12 \tau_1^2}, \end{aligned} \tag{10}$$

by putting $m = 1$ in (9) (the signs of the $s_i$ are arbitrary, so that we are only essentially concerned, as in the multiple correlation distribution, with the squares of the correlations). From these series expansions the integral in (8) is expressible in terms corresponding to the conditional moments, for any set of $t_i$,

$$\mu(t_1, t_2, \ldots, t_p) = E\{(s_1^2)^{t_1} (s_2^2)^{t_2} \cdots (s_p^2)^{t_p} \mid r_i\} = \int (s_1^2)^{t_1} (s_2^2)^{t_2} \cdots (s_p^2)^{t_p}\, p(s_i \mid r_i, \rho_i = 0).$$

In the particular case when only $\rho_1 \ne 0$, the moments $\mu(t) = E\{(s_1^2)^t \mid r_i\}$ from the single factor $g(s_1, \rho_1)$ are all that arise, but in the general case, it is important to notice that the quantities $s_i$, while statistically independent unconditionally, are no longer independent for the conditional distribution $p(s_1, s_2, \ldots \mid r_i)$.

This completes the formal solution. It remains to evaluate $\mu(t_1, t_2, \ldots, t_p)$.

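4. The conditional moment $\mu(t_1, t_2, \ldots, t_p)$. First of all we note, by the choice of the components of the dependent vector variate, and applying the transformation of section 2 to such components, that the multiple correlation $R_i$ between the $i$th component and the $q$ components of the independent variate is given by

$$R_i^2 = a_{ii}/(a_{ii} + b_{ii}) = \alpha_{i1}^2 r_1^2 + \alpha_{i2}^2 r_2^2 + \cdots + \alpha_{ip}^2 r_p^2,$$

where

$$\alpha_{ij} = w_{ij} / \sqrt{w_{i1}^2 + w_{i2}^2 + \cdots + w_{ip}^2}.$$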





,4 (i*.m that ..f thf wt* uutr that the ii\, 

(is ' I 5‘. iillMwjtiK; j'mj rttuvciiifni'i- ic,, in vary from -- t<i j-) 

f»'i lh« "httkagv 

Jh n< i if ar Jnsh^Mj-m !*< thr variahh's r„ , fl,, tltifini'd liy 
r., w'., • n\-. ■ '■• f 

«,i W,j, 


. 1 3 


Mtl <1,1 H*f4 ft,, , 
a,s MU ft„ *jiji ft,, Cfts (t,, , 


fSti f‘,i ff,> ain • ■ Hn ft,.,, , , 

thv M t * <■„ , h>r n<irma! h',, w.uiid all ht* iiidcitcinIfMil witli Ui^lrilinliuiw; 


ji r x' di-mlattiHfj with /nit'gna'-^ nf fiffiimn, 

» I'J! ' f'lt*’' ’ 'ft, ,d*t,., 

1t - 'f.4 "X l>>t I . 'J, • ■ jt !l " ft, ,i ! ■ ' “ 

Hi ft<-iafrfl Jh»'tt iM*ii'lnar htt Rivi'ii », lint tin* hnkana fut-ttif rcHitllH 

Hi nit vh-vrttani *4 th«‘ x* to a di’Krwjj of iu'iHlnm, ami a liiikaKC 

f:vl<iT for stjo t»,,4islnlint)*,ntt of 


tt Il«*rv 




rr rl|t/t ftln^aj 
f-Vriit»- Kin 5/4 


A KO«n*ji 5 ' i' 0 (j/r,,,i. 


We may now, having obtained the distribution of the $\alpha_{ij}$, note their geometrical interpretation. Let us denote the $p$ components of the dependent variate in the sample space by the $p$ vectors $\xi_1, \xi_2, \ldots, \xi_p$. Let the $p$ orthogonal canonical components corresponding to the sample canonical correlations $r_i$ be denoted by the $p$ unit vectors $x_1, x_2, \ldots, x_p$. Let the corresponding components for the independent variate be $y_1, y_2, \ldots$. The "linkage factor" merely represents the allowance that must be made in the mutual relations of the $\xi$-vectors for the fact that while they must lie in the $p$-space of the $x$-vectors, they really belong to the original $n$-space. Thus the coefficients in the equation

$$\xi_i = w_{i1} x_1 + w_{i2} x_2 + \cdots + w_{ip} x_p \tag{14}$$

are such that

$$w_{i1}^2 + w_{i2}^2 + \cdots + w_{ip}^2$$

is a $\chi^2$ with $n$, and not $p$, degrees of freedom. Normalizing $\xi_i$ to be a unit vector, we have in place of (14)

$$\xi_i = \alpha_{i1} x_1 + \alpha_{i2} x_2 + \cdots + \alpha_{ip} x_p, \tag{15}$$

with a projection, on the $q$-space of the $y$-vectors, of $\zeta_i$, say,

$$\zeta_i = \alpha_{i1} r_1 y_1 + \alpha_{i2} r_2 y_2 + \cdots,$$

and hence, as already noted in the algebraic derivation,

$$R_i^2 = (\zeta_i \cdot \zeta_i) = \alpha_{i1}^2 r_1^2 + \alpha_{i2}^2 r_2^2 + \cdots,$$

where $(\xi \cdot \eta)$ denotes a scalar product. The linkage factor arises because the $\xi_i$ vectors in (15) are not independent in the $p$-space of the $x$-vectors, the distribution of their mutual configuration being determined by their positions in the original $n$-space.
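This interpretation enables us to determine the moments of the distribution $p(s_i \mid r_i)$. For if corresponding to (15) we write

$$\eta_i = \beta_{i1} y_1 + \beta_{i2} y_2 + \cdots + \beta_{iq} y_q, \tag{16}$$

then

$$s_i = \alpha_{i1} \beta_{i1} r_1 + \alpha_{i2} \beta_{i2} r_2 + \cdots + \alpha_{ip} \beta_{ip} r_p. \tag{17}$$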




If we are considering case (a), the relations of the $\eta_i$ to the $y$-vectors will be similar to the relations of the $\xi_i$ to the $x$-vectors. In case (b), however, the $\eta_i$, which represent the true canonical components of a set of fixed vectors, must remain strictly orthogonal to each other, though their relation to the $y$-vectors can vary. This means that the relations of the $\eta_i$ to the $y$-vectors are determined by a random rotation of a rigid orthogonal set of vectors in case (b). We may note that if in case (a) we let $n$ tend to infinity, the $\eta_i$ would also become rigidly orthogonal, so that the solution in case (b) may conveniently be obtained from case (a) by retaining the same distribution of the $\alpha_{ij}$, and for the $\beta_{ij}$ letting $n \to \infty$.

Thus in either case the moments of the can lie r 4 ,teitir.l fron, i!?;, i„ *„„„ 

»t« of lor wWfh th^ 

comifie te the requned solution for (,?)'.(.|)'* . . . 

e andcoTfl linkage factor can bt* cxprc.v»4 i« terns. ..f 

•nd fl" t«d * “"I'Adl' r' “•* *- ■ 

c me Utj antt 0,, . j Ih« methsMi w unf«f titiwtely 





f**(f rtjRt'I'rainsliy to U* ttf nuy prswiit-al value eM'epI in the case of 

MHC r<*>ts. TSii** ca^c i,-> nuifajiiemi se]tjiriU<-iy hclorc flic t'ciicial case la 

• Sett 'it.K''! !' H*!ii r 


5. The case of only one non-zero root. Here we only require $\mu(t)$ and a comparatively simple solution is possible, the linkages within the $\alpha_{ij}$ and $\beta_{ij}$ sets
l’< itia \Vc insve m if V' ll»' («»«!'■ Uctween m and (i , nxlicrc 

^5 'a;u> dio projeetjon of in the »/ tpaci*-, iJuit ^ ^ a randoni angle in the f/'.-inict', 
s^jijce !!«' fi,. an<i » 1 h w’^i* ate ittdcjHatdeiit Hence in Ihiin jHirltruhir ro.o «(* may 
eon versa idly wide cj f*‘i SI. jnsi tlic inui*>forfnidion tisc«! let ttltfiiin 

lijf- »if the mnlijjt!'- e•lS!^'l!l^ion Ul . Thus «e may rephn'e llOi l»y 

"P fi-at; » mIs*} < ■ ‘ i ni;,rp,and 


■li 


j < 



~i Jfi' ii-’l- ■ • 

■ eos'" fiti sin ' sin’ ' flu ■ - • , 


w|(‘ie l}»«' evjteclw} value of the trigonoraetiie leriii isevalnaH'd sis 
:!'»«! 4 4- I »' • * ! Ts \j>i 


HHi 


I niinji... ;rqp'hir 

\\V have novi ohtttinerl (lie dintrilmHoii, -•• p,, lli, 

jfilr, (>i (l| /n’r, , p, d< • UH -"'i;;. i"’ 

(11 is given liv ; and in ease fat 

r 


where pir, ■' f>, 
f"'' «! . »! . 




Vi\n 


I'i ( 

I'j 

!'i ^/t i I ti 


rfa, 4 li’ 

raiMd 


Ssnd m ea,M> d*' 

f "I’sj, , sit, • ■ ■ > 

where jq «fj » • ■ 
all «'» from fHo f , 

geometrie fnnrtion. 


o, nja d-MnjtdnV/’J ftfrint f JT 

' 4- /! L HiMt,: . 

5 «5,i- tieiioted Ity/. Hw.-ai/-- deludes hUminntinn tif 
'{‘he solution in either ease eoiifains a generaliasl hyiicr- 
If v\e delude (lie general series 

V(ii, -f* a/irp 


j'rta, f nrftr- 4' h ivriirir.i 
‘r ' nwsin'ttd I’sr, i tn'(r. pofi 

Y /rf'flr } ii rdVH'frj) A 
^**’■“•"'1 rtnj nr, > orirj 4- H M 


rw, + 

r(d,)H,i , 


i 


»>y 


FCtti , erj ; A , 1 

Fia-, A , A I 


f » r, ^ fa 5 x, , * 


, A n , fi ; /i , Xi 
;


respeclivply (wT |8, i>. .'Wl 2:? i ■>, 

»(f, I /»i / fl' fi. »i '5 "' 

( 19 ) 

XF(\ii, Jn: *2, H, • ,5, )p, Iv ■ r '■ 

and in case (b) 


(20) 


p(.r, i iJi 5** (h - pir, : A ' • 

XA’(|n;ii, ■■ If. K* 


■ I." ■ ,1- ; 


. I 


An alternative (iiinmlirtnnl form »a '.)■ »< i‘r '■nvi, • ! 

given / =s Ui + «3 f }|^ i|. f,;> u 4 ,n-t .,)>!?<• , .x-ft , r; 

fr-i - ' 

whore for dermiU'iioKK \v(* < w 1!^ i ^ -i « < , * /. 

I 'i«i 'I ' rt« ' i ' , I J 1 

WG have 

F{^n, In; |, |, < • • , ^p, Jij; ^ 

(21) 

• h,, \ p. f If 

where 0 denotes the tiiteralion «f taking Uj<' inw i.'Sriji; nJ j > s". ,/'*.(! 
possibly be done by miiUipIie«,tion by j ’ <nvj n <4 i« amij.) v « .j j,, , ,i 

mtegi'al but in the use of this forauila In-rr ih*- «.jM-raSv.in m . . 1^ . ,A , , ? 

directly) . 

It 18 of some interest to examine a Mtiipje ow. ni.d, , 1 . - Pr . ^ 1? 

/ Pin'! Ad '« 1. 

Jt, 

If we take p = 2, g = 3, we obtain for p(r| , r| , ^ ^ - « ^k<i 

iin - 2)(n - 3)(a - 4)(l - rj)*'’ '*0 , ,,.v 

entaiy cason = 0, wc obtain on mtegratimi wf r| from H i« 


Vih 1 Pi) » 6(r|)’ dr]{l ™ p\f r,, r 

^ * I riai 


. 


L rf3) J 

. i‘i + Owl <] 1 ' 

I’ll + ij I'i fit's fiajttef 4 2; 4; ' 

where ( = n. + w, . How from the identity (1 » ,,, . ^ ^ 


< ;; . (I , jj, left-hand .side. Tliirf 

I .j ' 1< 4» ’ "*)* j«l< tdsty. i»‘i dl f • 0. 

V'\ ^ i . ^ r(| 4 / - 1 - I) r(i+f-l- 2 ) 

F’* 4 2 : ' -f‘ 1 )! “" lY^KH- 21 ! 

. F 3 )rf| + 1 ) 
ri§>r(t + d) ' 


M ■ \ ( H,) a iv.^ f u 

} an ■“ a ivniat + .Si' 


li'! 


?' ' . ('- ' >'• ' d’ j 5 


in 


v* ^ 

‘ If, 


m d- 0 a -i a) , , , 

1 J.,|J ' ■! • f)l * ll 


l' d 


'2 5 d' -i) 

»<•, d >f« 5 d Vlr? ; '< f) di 1 — /<; i-j 1 ’ ! . 

4"'* jM, ijiiH' finni'-uratiointf li frtim (ilo 1 . In pui’cly iilKt'-hraie 

i i‘f ,1 

i 2 J’ 


?■■' f*r' .! t iwid.rl'' «/y, >1 pin**. 

, ”i. tl'ijj ; ! < ^4 hiunn!,**- »2l 'i, w** have fnr tiiri Kittue i!nw /i *- 2, 
V .5, h ih'” >i .ii-m 

'2:. 'd -rr! .r/? » ri; , 413; i 

In'*"'}?* .ii’iiii? "n n !«» f'i , wiMditmn 

tj'2)t ikl - pi/is:)* ■ Jl\ 


‘2'4i fi,’ ' <*Fi .'} a,,,,.* ' '1 ' i' 

• in ■ 


*1^1 1 

1 hM !!•«' F't ih<‘ irratuntal rxidOHiMn »1 ■■ pirjzr fimr(>lj<, 

ni.'i Ifjfi- If.f.* '* «»■! itntu ' neh jMi'h-nt ff s, wr ulitiun tin* dbtrihuthm jalrj ; pd 
sn '254' <.T i23 1»', M|f the aj»jir<>pri.'iJe Kmrs't, Wi* may rurlVitT 

Hdr-gf .'<!*' ♦IsmSiv sh*’ t %j^r« .sletvr ttjjji rfi'irt Ui r'l , atul after tliwarding 

vii Ir’^i .irt* t< THSfi «<• tdiU am 

4 i 3 !| jfjj s ini'' Id. a, j J » }11 ‘ Pl») fi 

I 2 <pi 2 r I 

tt. j»fl* fCiwIily sm'wwitiwf t*» 5.K' unity. 

6. More than one non-zero root. In the general case the factor multiplying $p(r_i \mid \rho_i = 0)$ is rather remarkable in being symmetrical in both the set $r_i$ and the set $\rho_i$. As $n$ increases, the convergence of the $r_i$ to the $\rho_i$ requires that the $\rho_i$ are also arranged in descending order of magnitude, owing to the restriction $r_1 > r_2 > \cdots > r_p$. The limiting distribution is given by Hsu [9].

In ■view of the algehraie diflindty «f uhLsirnrig pt'S- , , .'j - i 

integi’ation, an unpjTumetrie iiiefhwl i(f ulifainsna, if « »i5fn.<^T ’*/ ' t j--; 

Tliis is fairly trae.tahle. in the ea.>*»* nf two Jn^n v '1 -- - }.‘i *•<* 

of the original variables is tran.Hformt'fJ by un itjI ‘-.nv*' n - uJt 

that the first new varinlile nf the sf-j'i.iid fK-f ik d< 'non'-) b--, 'I <- ?,< si 

between ttfi; and uij . IVe may write, fi.r e\«itii|4e. 

tt’si = (a’nit'n + u'isK'k Hf' •*"' t«}| i tr': * • 


(28) 


lOja — 




-Wi,(u)ij + - .• H- JPsr'tr?, 


W?| 


•j 1/^11 t 


fioti + ■ * • 4- wjp'liier V 
-li'iaftef, HI t- (cJ^nej, 


miL 


♦ M ,)t' 


f(ti'i 2 Hh * • • Hh ie|j.)'n'i« 


t 


«rn 


which conversely we can at mice exprew ami rehiriiin *-1 Jb< *. , i.i/Ui.f i «l,f 

tOa/i (since the reciprocal of an orthogimal niHlrix fimip’v j»'i ij.'tp. -.r 
we write 


«si *= tOai/KtCai)^ + (icmI’ ’t *• * » 
aj'i = U)m/[(u. 4 i)^ 4 4 ■ b 


arid write further 


aj — cos 5ii I Oj — cos <li3 , • ■ ■ ,bt ens , 
bi = cos fljj , • ■ • , where oji =• ctw gj, , «r^j nin #1, «•<«> , 

we have in particular 

021 = ttibi - bi ^('1 '2 ^ 4 )^ 

(30) aia « a^bi 4 y/^ 


^ V (1 - rtit \ 4 I .. f 4 \ a 


where the distribution of the a’s and 6 'a is proiwrtinntd t« 

1(1 - a 5 )><-'> dai] 1(1 - do,) ... 1(1 - „ j ^ , 
For the reasons discussed in Section 4, it will be noticed that only the distribution of $b_1$ in the $a$, $b$ set is affected by the linkage factor. By such methods the expressions

m(1 , 1) =’-■ KUUl i r<l . m( 2. 1) nr I r.l 

were fairly readily obtained. If we introduce the notation

.S'* M t, X*« « Z irWi)\ 

ttnl ifif 

and also symbols for the products of the $\alpha$ and $\beta$ moments, viz.

^ I'.'ioiioiai /i'ldiidBl* ^ ^ A'lniiOtss! 

etc., we may list the moments $\mu(t_1, t_2, \ldots, t_p)$ as in Appendix I, which gives all moments up to the fourth order in terms of the $\alpha$ and $\beta$ moments (the numerical coefficients arise from the numbers of ways of forming the two-way partitions). The factors corresponding to the $\alpha$ moments are listed in Appendix II against their appropriate symbol, the corresponding factors coming from the $\beta$ moments being obtained in case (a) by writing $q$ for $p$ and in case (b) by writing also⁵ $n \to \infty$. Thus in case (a)


a ' 1 , It 


CW 


and in case (b)


n ‘ 

l 2 

n d 

1 


isp^p 

t 2«. 

Lm/i'y 4- 2(J 



tip 

}■ a - 2 1 

uq 1* n 2 


Jipi p 

i 2pp 

■ ■ 1 ) J 

jtqiij 4“ 2)((/ 1 1. 


(„ p) 

- (» - q) Ti 

_a/np -f- 2Kp ' I), 

jKtiq -f 2!if/ ■■ I ijj 


ro' I . I ' 




r » . 2 ■ 

J 

1, a/i p * 2<_ 

jHq !• 2j_ 


jnp d' M ~2)(q + ll + ‘ifrt “• pA oc* 
l«/e'p d- iih p “ liq(q + 2)(f/ -- “ 


⁵ It should be remembered that we have assumed $p \le q$. If $p > q$, we interchange the dependent and independent vector variates, and hence must interchange $p$ and $q$ in the formulae, $p$ now corresponding to the independent variate.

By means of this transformation it is possible to develop the moments $\mu(t_1, t_2)$ in the case of two non-zero roots, though in obtaining the results quoted in Appendix II, where the formulae for $\mu(3, 1)$ and $\mu(2, 2)$ are included, it was found convenient to supplement this method with the devices mentioned in the next section. In the case of more than two non-zero roots it is possible to carry out a further set of transformations to "partial" variates, where


, -• !;,r V;iy 

”4 s-f i i 2'tt, i ' I )j ? 




'■'.a ''■)! 


iw ('(*enicii*ntti. 'I’lns riiabbs Ur- f** ivpto-i' '•>> n- “'Wi* ♦-i , •*' 

whicii tlir i(« rt'lnt*-*! 1 m lb<> pailin! r''r« bi<v-M • •!,. ,-,'’1 ' ' *; m , 

i.G. to the scf'iiiu! I'orrelatiMn be'S^r an ju* h jeP* • a ihi ‘ . .iu , ’>' ■ i* 

'I’hiK metlifMi is, however, isji.iin l<!<i ouuJ-eT, , j|,< 1- * i sm j< 1 . .jJ"’. -> . e 

rapid metliod of evsdiialiiijt Jje'h . h . .■'? ‘nff a* ^l 4 e, -b ".r/i'Mp- | j,: j.ao'i P u, 

lias not bmi entirely ‘‘■olved im tlie .ejiicj',' ./it-J-i- ’ I'-r^ jS< 1!.’'- 5 oeh 

ill the eonehidiiijs M'etiMji me sueiitioied fh ve < < /'.'li.vh! ''wf,,!. 

and whieh etiahl(*<i (lie j«n tPc ji iji.ottsiM.: h ' ii< -o,.;.* <, 1 

to Ik* (‘Minph'Pil atui add<*<l ^o Attjfeiidii. il 


7 . Relations amon& the « -moments, I - .• ■•. V, • ',"/i s ■ 

?, beiiiR random 'leciui-, nt Hue p fp.e<' Mj lin 1 ■■ *' > t ' i' - , u- >' d 

foufignral ion IrtiiK detenmned S<y * 3 i>' -i '-i''", -• ' 

provide relations riuiMiifj the -« nf - 'iSv' m, ,,"M' , t.. -i , 3, 

mil M.! “ '*,j ' ' .1 , 1, '. 1, „* , j.i, 

the eorreliition of anv aph a PX'd toiM.. 35, «'< , ■ ■'i , „ ny 

Ui i X3I. \/2. i** a random enrrel.iJiMii !iJ /I* **)■> o , ,,j., 

5 , with any olhei is u tmidoru e,iii»i-S;ji(oit bj» 1, 'I Ir >< ,,<' .4 thi- ,< t..', 

is iHWt illiistniUti by an esnmple and - MiJiMnnf m-'i di .-i.e,!,* thi 

Rix rt-inomeuiH nainireil for iiH, 1, 1 1 will l•ed«•taMd 
For eotivenieiif'e. tlenole the leipiired (jjean labi* *4 

ftllftailtjl t Otiahoji , <*1|0?,WSJ . ,«»)..« - 1, .I'-t 1- ,i,t 

by . 1 , /f, f\ 1 ), K, /•' tes|H*<'live!y. MiillipK the u-, ..nd ojij, 1 /j-. „t, .r.,d 
auttaa , aii«ist)f3i«£j by expreasitoi dPli fm a d, mee »Ih7 t, j-j, jaM, ,ai; y 

unity, the conminent niean values two inuilleml 1 hi» g, !'■><• 1 1!.-- isjo <• 1, ’ 


-•I -I- Ip “• \iH IH • 2i Jnp'p f 2 i, 

,„g, A -b dtp - ])Ii 4 ip - Ipp - 2 'f’ 1 ft, 

A + (p — lUi i 'dtp I op 2 il> 

i 'p f'‘‘p 2 i P J I 1 

The moment vl i« the mean of the triple jif»ahi. • -d iho .4.3 3,a,u,|''ii» «’• 

of and with Xi . The anme vnlrn* imistt bi* fei»l(r)»'<t wm!j iit.% ; to-rej 

vector in the p-Hpace.e.g.ttith either IX, i V 2**1 tvph , t, > ~ t,' 

/V p. Thin Kivea two relations 


( 30 ) 


.1 *• n 4 1 ) n 

ip + ipl - Zh - I2f) - tp 


2> ‘t ' * <1 Hi< I m 





A fitui! Imrarly iriflfjffiuifnll relation ih ftlitaiueil fiKiii flo* mean triple prednel ftf 
■ ^ 3 >, ■ ?»*! ■ Ith wliirh ciejieiula w»lely 'Oi the infernal (atiifij^instiiin of 

anr! , ami i« easily ahown (c.g. rlaMt^r 1«* ntinritle wifh one rtf the oriRina! 
axeartf the a-spaeei to 1 m» 1 n^. Tins given 

l.'f?) pA + h/h;) l!/t4 l!'/t 2'if t 

The equations contained in (35), (36) and (37) determine $A$, $B$, $C$, $D$, $E$ and $F$.

Similar equations could evidently be constructed for the higher-order moments, e.g. for the terms required for $\mu(2, 1, 1)$ or $\mu(1, 1, 1, 1)$, but the numbers of such
tr*rni,^ iiit'ie:e(- lajiully. Fiuni Ajtiienilix 1 it will be !*een llial there are 2t 
di.stiiH't te) m.'> in j#' 2, 1 , I ami lllin 1 , 1 , 1 , 1 . 
ON THE THEORY OF MARKOFF CHAINS

By Elliott W. Montroll
fnirrmiu n/ 

1. SumnuitT^ Alliuninh j.''* TA^-sif « r* “.l-f i 

prabability nf infir*jH>n(b’t»< swS jmmtcjI'jI tic Ir/n (•.<-•" b '/if ''.•xi'i. ‘U u".' 

for tlio amJyfsw of nioioii of tlio jcof.ifjrt' sti il,’" I'f. i, Cf.f *] o. -if 

Iirobabiliiy of I'lr'iit-i hm- !'<'■« j< 1 lif b;ri '/■■'Aii'f/i 

invMlif^ftiiniiH in ths.*’ «ubjf<-t wn*' jtijbb‘*l,nn,rt \ f5 '3 > Is.*- / •'V ii-, '2 

haa exti'iuli'd liio fninbHiiifntal Sunsl •bo-.fu'jo'' ’«■ /tiioi.-r of «]{<] t'h >i:<“ 

The mrt»* rxtenptvp t'\||>or.i(ioii of sbj- 1 io!>i Ko-. !.r< 5 j j's i./<' I A Irf.’!*; f 

In the prwnl papor \\r “i I-.t. ■! 

(■haloB ttf ilcpemiHit viniabh*. .'lud f nd tbc p.r. ■;/*, *,!P y s, ■; 

functions. It will ii<" <hs»t f'^r < ‘ ,1 f . , fl > ? ,’Mil 

flistrlliution fijiietioii'i ran in i« ij«*i > i si r b-/ «. n f, 

vertorw of a wtain ojHTai^f r^jnjpiojt \|;iny oj *J,,. .,.,,1 p, ,, 

have lK>etuipplirti to j.roi.h-jn* sn -4 }3,< ■ Li.sio M 3 .'.,-,;' i ' , i 

imiKirtant appliration h!(?i ni,»4<’ b> I !*' -i •■''j'lr: ^ 'i/i 

ton tile hiisift of a Biin]<!i<i«''(i inn*>i -‘larj,', , t.;,) ' * n * 

solal with t'(»oiH*rutiv<< element- R-ebi s»< .♦ ^-bafw S'l oj*ss-'.f) t * < P* f-* » 'sp.,, 
application of tinr-nr ttptTfpor llieoty 'jbvnj:bj,nSTjui*an4n4<->!va‘;K'p. -p. “.h 
probability ehaitia ban apparently I w»t*n made In- Ibni'jtrAy [nj 

2. Introductory Remarks. Consider a chain of $n$ events each of which might lead to one of $\nu$ possible results, and which are related in such a manner that the probability of $n$ events leading to a sequence of results

$$\alpha_1, \alpha_2, \ldots, \alpha_n$$

is proportional to $P(\alpha_1, \alpha_2, \ldots, \alpha_n)$.


The probability of a given fnncUfmF(a-,.«s. I havii^ a I 

mg to the sequence of et’a would Ire pmiwrtional to 



where 


(la) (F(«i » < • . * 
and the summation extends over all values of

$$\{\alpha_j\} = (\alpha_1, \alpha_2, \ldots, \alpha_n).$$

The probability of a result $\alpha_1$ of the first event leading to a result $\alpha_n$ of the $n$th event is

( 2 ) , «*.) fl/Fa) 22 I «a I ' • ‘ I «J. 

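In order to find the probability of a given function $F(\alpha_1, \ldots, \alpha_n)$ having a value between $\xi$ and $\xi + h$ it is useful to know the moments and Thiele semi-invariants of $F(\alpha_1, \ldots, \alpha_n)$. Both of these functions of $F$ can be calculated from

$$Z_n(x) = \sum_{\{\alpha_j\}} P(\alpha_1, \ldots, \alpha_n) \exp[x F(\alpha_1, \ldots, \alpha_n)].$$

Obviously

$$Z_n = \lim_{x \to 0} Z_n(x).$$

It is known that the $m$th Thiele semi-invariant is given by

$$\Lambda_m = \lim_{x \to 0} \frac{\partial^m}{\partial x^m} \log Z_n(x). \tag{4}$$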
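The Thiele semi-invariants are what are now usually called cumulants, so they can also be obtained from the raw moments of $F$ by the standard moment-cumulant recursion. The following is a minimal sketch of that recursion; the function name and the test distributions are illustrative and are not taken from the paper:

```python
from math import comb


def cumulants_from_raw_moments(moments):
    """Convert raw moments [E F, E F^2, ..., E F^m] into semi-invariants.

    Uses the recursion kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) kappa_k m_{n-k}.
    """
    kappas = []
    for n in range(1, len(moments) + 1):
        kappa = moments[n - 1]
        for k in range(1, n):
            kappa -= comb(n - 1, k - 1) * kappas[k - 1] * moments[n - k - 1]
        kappas.append(kappa)
    return kappas


if __name__ == "__main__":
    # Poisson variate with unit mean: raw moments 1, 2, 5, 15.
    print(cumulants_from_raw_moments([1, 2, 5, 15]))  # [1, 1, 1, 1]
    # Standard normal variate: raw moments 0, 1, 0, 3.
    print(cumulants_from_raw_moments([0, 1, 0, 3]))   # [0, 1, 0, 0]
```

The first two semi-invariants are the mean and the variance, which is why the logarithm of $Z_n(x)$ is the natural generating function here.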

In the notation of Cramér, $Z_n(i\omega)/Z_n(0) = f(\omega)$, the characteristic function of $F$.

If $G(\xi)$ is defined so that $G(\xi + h) - G(\xi)$ is the probability that the function $F(\alpha_1, \ldots, \alpha_n)$ has a value $\xi < F(\alpha_1, \ldots, \alpha_n) < \xi + h$, then it is well known that if $G(x)$ is continuous at $x = \xi$ and $x = \xi + h$

1 fl — r 

P*i U'il i it j ■ <Uil „ liin / eAl)[log/(w)l(f« 

*»1Tf V J* W 


wheje 


ffl 5 


When the derivative of $G(\xi)$ with respect to $\xi$ exists, the probability of

$$F(\alpha_1, \ldots, \alpha_n)$$

having a value between $\xi$ and $\xi + d\xi$ is

(fibl i?«|? «l| « Urn [ vx[> ,\m{iuT/m^e~'’*^da. 

*ir }“ ) 

From Cd# 


(71 


S A«.(i»r/w5 


-log 5 :,(()) + Ilm c”*“*’*“ logX«(a;). 

«r“*0 







Sinw, f<<r ;i rMiiMiiut r y. 

we have 

i: A n'!w: ** fri' I'm A A''!-. 

jiTiii front <n, 

10') (7{c- + /») - J hut I r A 'V' A 

KiinHliuitf >':{j, '•!), fjCj tapi<h iuh'-Mu-s’s-'U •'!,!' fj.tj i 

fhain uf furrclaht! I'VritfHnni Iwfithuntmi'i fr-iit a <.f /„ ^ , j. j;; 

now inlrtKiiici* jnor'iitnrf''* hit tj » f /„ ■ / jot ji' *. 

(if fMai . • ' ■ , a„i. 

When $\alpha$ is a continuous variable, the results of this and the following sections are easily generalized by replacing the summations over the various values of the $\alpha$'s by integrals, and by replacing the matrix equations by integral equations.

3. Simple Chains.

a. General Theory. By a simple chain we shall mean a set of events each of which leads to one of $\nu$ results and which are so related that if the result of the $k$th event is $\alpha_k$, the probability of the $(k+1)$st event yielding a result $\alpha_{k+1}$ is proportional to $p(\alpha_k, \alpha_{k+1})$; thus, the probability of the occurrence of the sequence of results

$$\alpha_1, \alpha_2, \ldots, \alpha_n$$

is

$$\prod_{k=1}^{n-1} p(\alpha_k, \alpha_{k+1}) \Big/ \sum_{\{\alpha_j\}} \prod_{k=1}^{n-1} p(\alpha_k, \alpha_{k+1}), \tag{10}$$

and the probability of a first result $\alpha_1$ leading to an $n$th result $\alpha_n$ is

$$P_n(\alpha_1, \alpha_n) = \sum_{\alpha_2} \cdots \sum_{\alpha_{n-1}} \prod_{k=1}^{n-1} p(\alpha_k, \alpha_{k+1}) \Big/ \sum_{\{\alpha_j\}} \prod_{k=1}^{n-1} p(\alpha_k, \alpha_{k+1}). \tag{11}$$

The summations are to be extended over all $\nu$ possible values of each of the $\alpha$'s. Chains of this type are usually called Markoff chains after the first author who studied them systematically.

From (1), the average value of a function $F(\alpha_1, \ldots, \alpha_n)$ is


(12) 


W, 


2:--i:ii /Ar,. 

**■1 ttgf iAwl 


X * LFCn,. •••.rrj 

"' " ,».ri 


1 o, ji ! 
Many chain functions $F(\alpha_1, \ldots, \alpha_n)$ are either additive or multiplicative and of one of the forms

$$F(\alpha_1, \ldots, \alpha_n) = f(\alpha_1, \alpha_2) + f(\alpha_2, \alpha_3) + \cdots + f(\alpha_{n-1}, \alpha_n), \tag{13a}$$

$$F(\alpha_1, \ldots, \alpha_n) = g(\alpha_1, \alpha_2)\, g(\alpha_2, \alpha_3) \cdots g(\alpha_{n-1}, \alpha_n). \tag{13b}$$

In case (b) it is convenient to define a new function $f(\alpha_1, \alpha_2)$ by

$$g(\alpha_1, \alpha_2) = \exp[f(\alpha_1, \alpha_2)] \tag{14}$$

and in both cases to consider a function of the form


n 1 


1, 1 .1 i 
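If the vectors $\phi_{i,x}$ and $\psi_{i,x}$ satisfy the operator equations

$$\phi_{i,x} \cdot P_x = \lambda_{i,x}\, \phi_{i,x}, \tag{20a}$$

$$P_x \cdot \psi_{i,x} = \lambda_{i,x}\, \psi_{i,x}, \tag{20b}$$

where $\lambda_{i,x}$ is the $i$th characteristic value of (17), then

$$\phi_{i,x} \cdot \psi_{j,x} \equiv \sum_{\alpha=1}^{\nu} \phi_{i,x}(\alpha)\, \psi_{j,x}(\alpha) = 0 \quad \text{when } i \ne j.$$

We shall for convenience always assume that the $\phi$'s and $\psi$'s are normalized;

$$\phi_{i,x} \cdot \psi_{i,x} = 1$$

so that in general:

$$\phi_{i,x} \cdot \psi_{j,x} = \delta_{ij} = \begin{cases} 0 & \text{when } i \ne j, \\ 1 & \text{when } i = j. \end{cases} \tag{21}$$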

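It is well known from matrix theory that one can expand a matrix element as

$$p_x(\alpha, \beta) = \sum_{i=1}^{\nu} \lambda_{i,x}\, \psi_{i,x}(\alpha)\, \phi_{i,x}(\beta) \tag{22}$$

and that

$$\lambda_{i,x} = \phi_{i,x} \cdot P_x \cdot \psi_{i,x}. \tag{23}$$

By substituting (22) into the expression for $Z_n(x)$ in terms of $P_x^{n-1}$, one can show that

$$Z_n(x) = \sum_{i=1}^{\nu} \lambda_{i,x}^{n-1} \Big[\sum_{\alpha} \psi_{i,x}(\alpha)\Big] \Big[\sum_{\beta} \phi_{i,x}(\beta)\Big] = \sum_{i=1}^{\nu} \lambda_{i,x}^{n-1}\, (\phi_{i,x} \cdot 1)(1 \cdot \psi_{i,x}). \tag{24}$$

Therefore $Z_n(x)$ can be determined from a knowledge of the characteristic vectors and values of the matrix $P_x$.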
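The spectral expression for $Z_n(x)$ can be checked numerically. The following is a minimal sketch, assuming a symmetric $2 \times 2$ weight matrix with non-zero off-diagonal element (so that left and right characteristic vectors coincide and can be written in closed form); the matrix entries and function names are hypothetical:

```python
import itertools
import math


def z_bruteforce(P, n):
    """Sum of prod_k P[a_k][a_{k+1}] over all sequences of n results."""
    nu = len(P)
    total = 0.0
    for seq in itertools.product(range(nu), repeat=n):
        weight = 1.0
        for a, b in zip(seq, seq[1:]):
            weight *= P[a][b]
        total += weight
    return total


def symmetric_2x2_eigen(P):
    """Characteristic values and unit vectors of [[a, b], [b, d]], b != 0."""
    a, b, d = P[0][0], P[0][1], P[1][1]
    mean, half = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + half, mean - half):
        v = (b, lam - a)                      # solves (P - lam I) v = 0
        norm = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs


def z_spectral(P, n):
    """Sum over i of lam_i^(n-1) (psi_i . 1)^2 for the symmetric case."""
    return sum(lam ** (n - 1) * (v[0] + v[1]) ** 2
               for lam, v in symmetric_2x2_eigen(P))


if __name__ == "__main__":
    P = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical positive weights
    for n in (2, 5, 8):
        print(n, z_bruteforce(P, n), z_spectral(P, n))
```

The brute-force sum grows as $\nu^n$ terms while the spectral form needs only the $\nu$ characteristic values, which is the practical content of the statement above.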

If there exists a largest characteristic root $\lambda_{L,x}$ such that

$$\lambda_{L,x} > |\lambda_{i,x}| \quad \text{if } i \ne L, \tag{25}$$

one can obtain some interesting results. Before deriving these, we shall give a sufficient condition (which is satisfied in many chains) for the existence of this inequality. Frobenius [11] has shown that if all the elements of a finite matrix are $> 0$, then the characteristic value of largest absolute value of the matrix is real, positive, and simple (nondegenerate). Thus, as long as $\nu$ is finite and $p_x(\alpha, \beta) > 0$ for all $\alpha$ and $\beta$, (25) is valid.
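The Frobenius result can be illustrated with power iteration, which converges for a matrix with strictly positive elements precisely because the largest characteristic value is real, positive and simple. This is a minimal sketch under that assumption; the function name and example matrix are illustrative only:

```python
def power_iteration(P, steps=200):
    """Approximate the dominant characteristic value and right vector.

    P is a square matrix (nested lists) with strictly positive entries.
    The vector is kept normalized so that its components sum to one.
    """
    nu = len(P)
    v = [1.0 / nu] * nu
    lam = 0.0
    for _ in range(steps):
        w = [sum(P[i][j] * v[j] for j in range(nu)) for i in range(nu)]
        lam = sum(w)               # since sum(v) == 1, this estimates lambda_L
        v = [x / lam for x in w]
    return lam, v


if __name__ == "__main__":
    lam, v = power_iteration([[2.0, 1.0], [1.0, 3.0]])
    print(lam)   # close to (5 + sqrt(5)) / 2
    print(v)     # all components positive
```

The rate of convergence is governed by the ratio of the second largest to the largest characteristic value in absolute value.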

We shall now prove that

$$\lim_{n \to \infty} \left\{ \frac{Z_n(x)}{\lambda_{L,x}^{n-1}\, (\phi_{L,x} \cdot 1)(1 \cdot \psi_{L,x})} - 1 \right\} = 0, \tag{25a}$$

that is,

$$Z_n(x) \sim \lambda_{L,x}^{n-1}\, (\phi_{L,x} \cdot 1)(1 \cdot \psi_{L,x}). \tag{25b}$$


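First let us consider the case in which $P_x$ is a symmetrical matrix. Then
$\varphi_{i,x}(\alpha) = \psi_{i,x}(\alpha)$, the characteristic values are real, and

$Z_n(x) = \lambda_{L,x}^{n-1} (\psi_{L,x} \cdot 1)^2 + \sum_{i \neq L} \lambda_{i,x}^{n-1} (\psi_{i,x} \cdot 1)^2$.

From Cauchy's inequality and (21)

$(\psi_{i,x} \cdot 1)^2 = \Big\{\sum_{\alpha} \psi_{i,x}(\alpha)\Big\}^2 \le \nu \sum_{\alpha} \psi_{i,x}^2(\alpha) = \nu$.

Therefore,

$\Big| \sum_{i \neq L} \lambda_{i,x}^{n-1} (\psi_{i,x} \cdot 1)^2 \Big| \le \nu \sum_{i \neq L} |\lambda_{i,x}^{n-1}| \le \nu(\nu - 1)\, |\lambda_{S,x}^{n-1}|$

where $\lambda_{S,x}$ is the characteristic value of $P_x$ second largest in absolute value.
This inequality yields

(25c) $\left| \dfrac{Z_n(x)}{\lambda_{L,x}^{n-1} (\psi_{L,x} \cdot 1)^2} - 1 \right| \le \dfrac{\nu(\nu - 1)}{(\psi_{L,x} \cdot 1)^2} \left| \dfrac{\lambda_{S,x}}{\lambda_{L,x}} \right|^{n-1}$

and (25a) (since $|\lambda_{S,x}/\lambda_{L,x}| < 1$) follows. When $P_x$ is not symmetrical, one can
easily derive the analogous expression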


1 Z„(x) 

lX?r*'(cK,*-l)(l-'n,.) 


1 ~ 


whore 


.1 = [max II (fl',.x • n i;||imix |1 (1 ■ 'I';,x)l)) 

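For brevity, when x = 0, we write $\lambda_{i,x}$ as $\lambda_i$, $\varphi_{i,x}$ as $\varphi_i$ and $\psi_{i,x}$ as $\psi_i$. By
summing (10) over all $\alpha$'s except $\alpha_1$, $\alpha_k$ and $\alpha_n$ we obtain the probability of an
intermediate event leading to a result $\alpha_k$ if the results of the first and last events
are known to have been $\alpha_1$ and $\alpha_n$. With the aid of (21) and (22) it is easy to
show that this probability is exactly:

(26) $\dfrac{\sum_{i,j} \lambda_i^{k-1} \lambda_j^{n-k}\, \psi_i(\alpha_1) \varphi_i(\alpha_k) \psi_j(\alpha_k) \varphi_j(\alpha_n)}{\sum_{i} \lambda_i^{n-1}\, \psi_i(\alpha_1) \varphi_i(\alpha_n)}$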


When n is very large, and when we have simultaneously n >> k >> 1, we can
rewrite this equation to include $\lambda_L$, and neglect all terms containing other i's and
j's. This leads to the results

a) If the number of events, n, in a simple chain is very large, the probability
$P_n(\alpha_k)$ of a kth event far removed from the first and the last, yielding a
result $\alpha_k$ when $\alpha_1$ and $\alpha_n$ are unspecified is


PActk) ’/'&(“*) v>i.(m) / (^z. * 1)(1 • 'I't,). 


(27)
b) When k = n, the probability of the result $\alpha_1$ of the first event leading to
the result $\alpha_n$ of the nth event is


( 28 a) 


So, as n -> M 
(28b) 


Fniotl , On) 


1-1 

|WL 


^ n(^l » ^n) 


'^L(«l)l#>t(ttn) 

(«J>/..l)(l.<ht)’ 


c) When there exists no knowledge concerning the result of the first event, the
probability of the nth event yielding the result $\alpha_n$ is

( 29 ) Pn(o„) = £ Pn(ai, «n) ~ 4 't(an)/(l‘ 4 >/.). 


In chains of sufficient length for (25) to be valid, the probability of

$F(\alpha_1, \cdots, \alpha_n)$

having a value between $\xi$ and $\xi + h$ has an especially simple asymptotic form.
From (6) this probability is (if for a given n we let $T = an^{1/2}$)

G(i + h) - OQ) = ~ lim r e 

27rt a-*oo J-on*/* \ W / 

(l ~ e-^") exp A, - + . . .| 

and from (25) and (5)

(31) $A_m \sim n \lim_{x \to 0} \partial^m \log \lambda_{L,x}/\partial x^m = nL_m$

if

(32) $L_m \equiv \lim_{x \to 0} \partial^m \log \lambda_{L,x}/\partial x^m$.

Letting y = (30) becomes 


rdy 


(33) 

^ 1 

47rt o-*«» a y 



where

(34a) $\mu_1 = (\xi - A_1)/n^{1/2}$, $\mu_2 = (\xi + h - A_1)/n^{1/2}$

(34b) $A_1$ = average value of $F(\alpha_1, \cdots, \alpha_n) = \bar{F}$.


L^y^i , 
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Integi-ating (33) 

(35) (?(^ + h)^ (?(f) - + 0{l/n)] d^. 

As $n \to \infty$ and $h \to 0$

(35a) $G(\xi + h) - G(\xi) \sim h\,(2\pi n L_2)^{-1/2} \exp(-[\xi - \bar{F}]^2/2nL_2)$,

and the probability that $\xi < F < \xi + h$ becomes Gaussian.

b. Examples of a simple chain. As an example of a simple Markoff chain let
us consider an event which can lead to either of two possible results, say “−1”
or “1”. Further, let us suppose that the probability of a given result being
followed by an identical one is p and by one of another type is (1 − p); that is,

$p(-1, -1) = p(1, 1) = p$

$p(-1, 1) = p(1, -1) = 1 - p$.

This chain would be encountered in an analysis of a sequence of tosses of a
coin with a “memory” so that the probability of two successive tosses showing
the same face of the coin would be p and that of showing opposite faces (1 − p).

A question one might ask concerning such a chain is— What is the probability 
of the occurrence of a given number of transitions from one kind of result to 
another? In the chain of results 

$-1, -1, -1, 1, 1, -1, 1, -1, -1, -1$

there would be four transitions, one corresponding to each −1 followed by a 1
and to each 1 followed by a −1. The function giving the number of transitions
in a sequence of n events is

(36) $F(\alpha_1, \cdots, \alpha_n) = \sum_{i=1}^{n-1} h(\alpha_i, \alpha_{i+1})$

where

$h(-1, -1) = h(1, 1) = 0$
$h(-1, 1) = h(1, -1) = 1$.

Even though the $\alpha$'s are dependent, in this special case, $h(\alpha_i, \alpha_{i+1})$ and
$h(\alpha_j, \alpha_{j+1})$ are independent so that (40) could have been obtained on this basis.

To apply the methods described in the beginning of this section we must find
the characteristic values and vectors of the matrix

(37) $P_x = \begin{pmatrix} p & (1 - p)e^x \\ (1 - p)e^x & p \end{pmatrix}$

(the configuration index $\alpha$ has the value either −1 or 1 in this case instead of
“1” and “2” as given in (17)). The characteristic values are the roots of the
equation

$\begin{vmatrix} p - \lambda & (1 - p)e^x \\ (1 - p)e^x & p - \lambda \end{vmatrix} = 0$

that is,

(38) $\lambda_{1,x} = p + (1 - p)e^x = \lambda_{L,x}$

$|\lambda_{2,x}| = |p - (1 - p)e^x| < \lambda_{1,x}$

and the characteristic vectors are

$\psi_{1,x}(\alpha) = 2^{-1/2}$ and $\psi_{2,x}(\alpha) = \alpha\, 2^{-1/2}$.

The $\psi$ and $\varphi$ vectors have the same components in this case because of the
symmetry of the $P_x$ matrix. Clearly

$\lambda_L = \lambda_1 = \lambda_{1,0} = 1$; $\lambda_2 = \lambda_{2,0} = 2p - 1$

$\psi_1(\alpha) = 2^{-1/2}$ and $\psi_2(\alpha) = \alpha\, 2^{-1/2}$.

From (26) we see that if the result of the first event in the chain is $\alpha_1$, and
that of the nth event is $\alpha_n$, the probability of the kth event yielding the result
$\alpha_k$ is

$\dfrac{[1 + (2p - 1)^{k-1} \alpha_1 \alpha_k]\,[1 + (2p - 1)^{n-k} \alpha_k \alpha_n]}{2[1 + (2p - 1)^{n-1} \alpha_1 \alpha_n]}$.

As k, n and (n − k) simultaneously get very large, $P_n(\alpha_k) \to \tfrac{1}{2}$, independently
of $\alpha_k$.

The probability of an initial result $\alpha_1$ leading to a final result $\alpha_n$ is (from 28b)

$P_n(\alpha_1, \alpha_n) = (\tfrac{1}{2})\{1 + (2p - 1)^{n-1} \alpha_1 \alpha_n\}$

so that

$P_n(1, 1) = P_n(-1, -1) = (\tfrac{1}{2})\{1 + (2p - 1)^{n-1}\}$

$P_n(-1, 1) = P_n(1, -1) = (\tfrac{1}{2})\{1 - (2p - 1)^{n-1}\}$.

Now, to answer our original question regarding the probability distribution
of the transition function (36)

$F(\alpha_1, \cdots, \alpha_n) = \sum_{i=1}^{n-1} h(\alpha_i, \alpha_{i+1})$.

We use the expression for $Z_n(x)$ determined from (24)

(39) $Z_n(x) = 2[p + (1 - p)e^x]^{n-1}$.

From (9) the probability of there being between $\xi$ and $\xi + h$ transitions in a
sequence of n + 1 events is

Gil 4- h) - Gil) •--- j-. J - '"•^)|p + (1 ^ p)e'“}’‘ da>/oi 

^TTl 

(40) 

„ -L •“( _ y ?t.l(l ~ Vtp"~ 

27n . -«■ j.„o (ti — /c) I fc 1 

Letting X = ami unuranging 


G(l + h) - Gil) 


I fnli\ - 7 ,)V~* j. 
TT^i (r< “ /.’)I/Cl ” 


(!+?({ + «). 


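where $D(\lambda)$ is the Dirichlet integral

$D(\lambda) = \dfrac{1}{\pi} \int_{-\infty}^{\infty} \dfrac{\sin x \cos \lambda x}{x}\, dx = \begin{cases} 0 & \text{if } |\lambda| > 1 \\ \tfrac{1}{2} & \text{if } |\lambda| = 1 \\ 1 & \text{if } |\lambda| < 1. \end{cases}$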


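We therefore have, when $[\xi + h] \le n$

(41) $G(\xi + h) - G(\xi) = \sum_{k=[\xi]+1}^{[\xi+h]} \dfrac{n!\,(1 - p)^k p^{n-k}}{(n - k)!\,k!}$.

Here [x] denotes the greatest integer not exceeding x. The sum is zero if
$[\xi + h] < [\xi] + 1$. When $[\xi + h] > n$

(42) $G(\xi + h) - G(\xi) = \sum_{k=[\xi]+1}^{n} \dfrac{n!\,(1 - p)^k p^{n-k}}{k!\,(n - k)!}$.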
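
The binomial terms appearing in (41) can be confirmed by brute force for a short chain. The following sketch is illustrative only: it enumerates every sequence of n + 1 results of the coin with a memory and assumes, as a choice made for the example, that the first result is −1 or 1 with probability 1/2 each. The function names are hypothetical.

```python
from itertools import product
from math import comb

def transition_count_pmf(n, p):
    """Exact distribution of the number of transitions in n + 1 events."""
    pmf = [0.0] * (n + 1)
    for seq in product((-1, 1), repeat=n + 1):
        prob = 0.5  # assumption: the first result is equally likely to be -1 or 1
        k = 0
        for a, b in zip(seq, seq[1:]):
            if a == b:
                prob *= p          # result repeated
            else:
                prob *= 1.0 - p    # a transition
                k += 1
        pmf[k] += prob
    return pmf

def binomial_pmf(n, p):
    """The terms n! (1-p)^k p^(n-k) / ((n-k)! k!) summed in (41)."""
    return [comb(n, k) * (1.0 - p) ** k * p ** (n - k) for k in range(n + 1)]
```

The two lists agree term by term, which reflects the independence of the successive transition indicators in this special chain.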


When n is large it is difficult to get a clear picture of the function $G(\xi)$ from
(41) and (42), so we shall develop asymptotic results for large n by using (6)
instead of (9).

By employing (8), we see that (this section will be developed on the basis of
n + 1 trials instead of n)

$A_1 = n(1 - p)$
$A_2 = np(1 - p)$
$A_3 = np(1 - p)(2p - 1)$ etc.

Therefore, from (6)


AG G(| 4 h) - Oik) 


^ f 

2ri 4 






$\exp[-\tfrac{1}{2} np(1 - p)\omega^2 - i\,np(1 - p)(2p - 1)\omega^3/6 - \cdots]\, d\omega$.
Letting $u = \omega n^{1/2}$, we have

^ f j-g- tuft -Ai)/n! ^ «Mf{*A A)' fllj 

2Tn i-« n 

fi _ - 7J)(2p ~ Du* . g (u*\ 

L (’w* \u/ 




'P^ 


du 


where

$\mu_1 = (\xi + h - A_1)/n^{1/2}$
$\mu_2 = (\xi - A_1)/n^{1/2}$.


Since 


f“e ““’c ''“dM ' 

J-‘<C 

i r 7iv ““’c du 

J-w 


5 (ir/a)* oxp (— X“/ ta) 


we have for largo ii 


A(? 


(43a) 


[27rp(l - p)]' 


pM} 

f p) 

A Bill J-., 




(■ 


2;)(1 — 7j)n* 

Aa n 00 and h — » 0, thin becomes 

h exp (-[f - Pf/^pD ~ V)u 
[2Trnp(i - 7j)]‘ 


3p(L - p) 




dEX. 


(?(? + h) - G(?) 


i“S-l - 


(43b) 


_ (2p l)(f 1 

r 2/;(l - p)n 


(i))- 


A similar problem which occurs in statistics of high polymers can be stated
abstractly as follows. Suppose there exists a sequence of events each of which
leads to a translation of length a of a point either to the right or to the left, and
that the probability of a translation continuing in the same direction as its
predecessor is p while that of changing its direction is (1 − p). After n translations
what is the probability of a point being displaced a distance $\xi$ from its
origin?

If “−1” represents a translation to the left and “+1” a translation to the
right,

$p(-1, -1) = p(1, 1) = p$

$p(-1, 1) = p(1, -1) = (1 - p)$.

The function giving the distance of the point from its origin after n displacements
is (when $\alpha = \pm 1$)

$F(\alpha_1, \cdots, \alpha_n) = a\alpha_1/2 + h(\alpha_1, \alpha_2) + \cdots + h(\alpha_{n-1}, \alpha_n) + a\alpha_n/2$

where

$h(1, 1) = a$, $h(-1, -1) = -a$

$h(1, -1) = h(-1, 1) = 0$.

Neglecting the terms $a\alpha_1/2$ and $a\alpha_n/2$ in $F(\alpha_1, \cdots, \alpha_n)$, one can answer questions
concerning this problem by evaluating $Z_n(x)$ as defined by (15). In this case
$P_x$ has the form

$P_x = \begin{pmatrix} p e^{-ax} & 1 - p \\ 1 - p & p e^{ax} \end{pmatrix}$.

Its characteristic roots are

$\lambda_{1,x} = p \cosh ax + [p^2 \cosh^2 ax + (1 - 2p)]^{1/2} = \lambda_{L,x}$

$|\lambda_{2,x}| = |p \cosh ax - [p^2 \cosh^2 ax + (1 - 2p)]^{1/2}| < \lambda_{1,x}$.
and its characteristic vectors:

>Pi,> -■= [(p ■“ 1)* + “■ 


V'il.* l(p 1)° + (pc'”' ^2) ] 



Since

$\bar{F} = A_1 = \lim_{x \to 0} \partial \log Z_n(x)/\partial x$,

one can show in the present problem that $\bar{F} = 0$. Therefore, the probability
of the translated point being a distance between $\xi$ and $\xi + h$ from the origin
after (n + 1) translations, is, as $n \to \infty$ and $h \to 0$

$F(\xi + h) - F(\xi) \sim h\,(2\pi n L_2)^{-1/2} e^{-\xi^2/2nL_2}$

where $L_2$ is by (32):

$L_2 = \lim_{x \to 0} \partial^2 \log \lambda_{L,x}/\partial x^2 = a^2 p/(1 - p)$.
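
The value of $L_2$ quoted above can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the largest characteristic root in the form p cosh ax + [p² cosh² ax + (1 − 2p)]^{1/2} and approximates the second derivative of its logarithm at x = 0 by a central difference. The function names and the step size are choices made for this example.

```python
import math

def largest_root(x, p, a):
    """Largest characteristic root of the translation chain."""
    c = math.cosh(a * x)
    return p * c + math.sqrt(p * p * c * c + 1.0 - 2.0 * p)

def second_log_derivative_at_zero(p, a, h=1e-3):
    """Central-difference estimate of d^2 log(lambda_L)/dx^2 at x = 0."""
    def f(x):
        return math.log(largest_root(x, p, a))
    return (f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)
```

With p = 2/3 the estimate is close to 2a², the case that corresponds to the linear polymer discussed next.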


Thus,

$F(\xi + h) - F(\xi) \sim h\,[a^2\, 2\pi n p/(1 - p)]^{-1/2}\, e^{-\xi^2 (1 - p)/2na^2 p}$.

When p = 2/3 this problem is equivalent to the determination of the probability
distribution of the components in an arbitrary direction of the distance
between the ends of a linear polymer. In this case

$F(\xi + h) - F(\xi) \sim h\,(4\pi n a^2)^{-1/2} \exp(-\xi^2/4na^2)$

a result obtained by Tobolsky [12] after a lengthy and complicated combinatory
calculation.

Another type of simple chain is encountered in the determination of the
“life span” of a particle which is displaced a unit distance to the right or left
per unit time along a straight line until it collides with an absorbing boundary
either −(q + 1) or (p + 1) units from the starting point. This problem has
been analyzed by M. Kac using the methods discussed in the present paper.
We shall generalize his results to include the effect of an attraction of the particle
toward one end of the line so that displacements toward that end are more
probable than those in the other direction.

Following the notation of Kac [13] we let $X_j$ represent the jth displacement,
$m_j$ its length, and $\delta(m)$ the probability of a given displacement having the
length m. Then,

$\delta(m) = \begin{cases} s & \text{if } m = 1 \\ 1 - s & \text{if } m = -1 \\ 0 & \text{otherwise.} \end{cases}$

If N represents the life span of a particle, the probability of its exceeding n is

$\mathrm{Prob}\{N > n\} = \mathrm{Prob}\{-q \le X_1 \le p,\ -q \le X_1 + X_2 \le p,\ \cdots,\ -q \le X_1 + \cdots + X_n \le p\} = \sum \delta(m_1)\delta(m_2) \cdots \delta(m_n)$

where the summation extends over all integers $m_1, m_2, \cdots, m_n$ such that

$-q \le m_1 \le p,\ -q \le m_1 + m_2 \le p,\ \cdots,\ -q \le m_1 + m_2 + \cdots + m_n \le p$.

Defining the new set of variables

$\alpha_j = q + m_1 + m_2 + \cdots + m_j$ (j = 1, 2, ⋯, n)

we see that

$\mathrm{Prob}\{N > n\} = \sum \delta(\alpha_1 - q)\,\delta(\alpha_2 - \alpha_1) \cdots \delta(\alpha_n - \alpha_{n-1})$,

the summation extending over all $\alpha_j$ between 0 and p + q.

As before, if we introduce the P matrix (of p + q + 1 rows and columns)

$P = (\delta(\alpha - \beta)) = \begin{pmatrix} 0 & 1 - s & & & \\ s & 0 & 1 - s & & \\ & s & 0 & \ddots & \\ & & \ddots & \ddots & 1 - s \\ & & & s & 0 \end{pmatrix}$

we obtain after applying the equivalent of (22)

$\mathrm{Prob}\{N > n\} = \sum_{j=1}^{p+q+1} \sum_{\alpha_n=0}^{p+q} \lambda_j^n\, \psi_j(\alpha_n)\, \varphi_j(q)$.

Here $\lambda_j$ is the jth characteristic value of P, and $\psi_j$ and $\varphi_j$ are its associated
characteristic vectors as defined by (19) and (20) (here the range of $\alpha$ starts
from 0 instead of 1 as in (17) and (19)).

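It is easy to show that the characteristic values of P are

$\lambda_j = 2[s(1 - s)]^{1/2} \cos t_j$ (j = 1, 2, ⋯, p + q + 1)

where

$t_j = \pi j/(p + q + 2)$

and that the components of the characteristic vectors are

$\psi_j(\alpha) = [2/(p + q + 2)]^{1/2}\, [s/(1 - s)]^{\alpha/2} \sin (\alpha + 1)t_j$ ($\alpha$ = 0, 1, ⋯, p + q)

and

$\varphi_j(\alpha) = [2/(p + q + 2)]^{1/2}\, [(1 - s)/s]^{\alpha/2} \sin (\alpha + 1)t_j$.

Since

$\sum_{\alpha_n=0}^{p+q} \psi_j(\alpha_n) = \left[\dfrac{2}{p + q + 2}\right]^{1/2} \dfrac{(1 - s)\{1 - (-1)^j [s/(1 - s)]^{(p+q+2)/2}\} \sin t_j}{1 - 2[s(1 - s)]^{1/2} \cos t_j}$

we finally have

$\mathrm{Prob}\{N > n\} = \dfrac{2(1 - s)}{p + q + 2} \left[\dfrac{1 - s}{s}\right]^{q/2} \sum_{j=1}^{p+q+1} \dfrac{\{1 - (-1)^j [s/(1 - s)]^{(p+q+2)/2}\}\, \{2[s(1 - s)]^{1/2}\}^n \cos^n t_j\, \sin t_j\, \sin (q + 1)t_j}{1 - 2[s(1 - s)]^{1/2} \cos t_j}$.

When s = 1/2 this reduces to the result of Kac (* means summation is only over
odd j's):

$\mathrm{Prob}\{N > n\} = \dfrac{2}{p + q + 2} \sum_{j=1}^{p+q+1}{}^{*} \cos^n t_j\, \sin (q + 1)t_j\, \cot (t_j/2)$.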
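
The expansion of Prob{N > n} in characteristic values and vectors can be compared with a direct step-by-step propagation of the particle. The sketch below is illustrative: the function names are hypothetical, and the orientation of the vectors (with $\psi_j$ summed over the final position and $\varphi_j$ evaluated at the starting position q) is the reading adopted in this example.

```python
import math

def survival_direct(n, s, p, q):
    """Prob{N > n}: propagate the position distribution, dropping absorbed mass."""
    m = p + q + 1                   # interior positions alpha = 0, ..., p + q
    dist = [0.0] * m
    dist[q] = 1.0                   # the particle starts at alpha = q
    for _ in range(n):
        new = [0.0] * m
        for alpha, w in enumerate(dist):
            if alpha + 1 < m:
                new[alpha + 1] += s * w           # step to the right, probability s
            if alpha - 1 >= 0:
                new[alpha - 1] += (1.0 - s) * w   # step to the left, probability 1 - s
        dist = new
    return sum(dist)

def survival_spectral(n, s, p, q):
    """Same probability from the characteristic values 2 sqrt(s(1-s)) cos t_j."""
    m = p + q + 1
    c = 2.0 / (m + 1)               # product of the two normalizing factors
    r = math.sqrt(s / (1.0 - s))
    total = 0.0
    for j in range(1, m + 1):
        t = math.pi * j / (m + 1)
        lam = 2.0 * math.sqrt(s * (1.0 - s)) * math.cos(t)
        psi_sum = sum(r ** alpha * math.sin((alpha + 1) * t) for alpha in range(m))
        phi_q = r ** (-q) * math.sin((q + 1) * t)
        total += c * lam ** n * psi_sum * phi_q
    return total
```

Both routines require 0 < s < 1; the direct propagation is the easier one to trust, and the spectral sum is the one that exhibits the decay rate for large n.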


4. Simple Chains with Restrictions. Often when studying chains of dependent
events, certain functions averaged over the entire chains are known to be
restricted between definite limits. That is, there might exist k functions
$g_j(\alpha_1, \alpha_2, \cdots, \alpha_n)$ such that

(44) $-\Delta G_j < G_j - g_j(\alpha_1, \cdots, \alpha_n) < \Delta G_j$, (j = 1, 2, ⋯, k),

where the $G_j$'s and $\Delta G_j$'s are preassigned constants. To calculate averages of
other functions (1) is no longer valid, for it is an unrestricted sum over all sets





of $\alpha$'s including those incompatible with (44). All unrestricted sums in this
formula (and other similar ones) must be replaced by sums over only those
sets of $\alpha$'s compatible with (44). Since it is sometimes more difficult to evaluate
restricted sums than unrestricted ones, we shall apply an idea of Markoff [1]
to the reduction of the former to the latter type.

Let us seek an explicit expression for a function $P_n^*(\alpha_1, \alpha_2, \cdots, \alpha_n)$ which
has the property:

$P_n^*(\alpha_1, \cdots, \alpha_n) = P_n(\alpha_1, \cdots, \alpha_n)$ when the $\alpha$'s are chosen so that (44) is satisfied for all j,

$P_n^*(\alpha_1, \cdots, \alpha_n) = 0$ otherwise.


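Since the Dirichlet integrals

$\delta_j = \dfrac{1}{\pi} \int_{-\infty}^{\infty} \dfrac{\sin (\rho_j \Delta G_j)}{\rho_j} \exp (i\rho_j y_j)\, d\rho_j$

have the property

$\delta_j = 1$ when $-\Delta G_j < y_j < \Delta G_j$
$\delta_j = 0$ otherwise,

$P_n^*(\alpha_1, \cdots, \alpha_n) = \delta_1 \delta_2 \cdots \delta_k\, P_n(\alpha_1, \cdots, \alpha_n)$

has the required character provided

$y_j = G_j - g_j(\alpha_1, \cdots, \alpha_n)$.

The average value of a function $F(\alpha_1, \cdots, \alpha_n)$ can be written in terms of the
unrestricted sum

$\bar{F} = \sum_{\{\alpha_i\}} F(\alpha_1, \cdots, \alpha_n)\, P_n^*(\alpha_1, \cdots, \alpha_n) \Big/ \sum_{\{\alpha_i\}} P_n^*(\alpha_1, \cdots, \alpha_n)$,

where the summation extends over the complete set of $\{\alpha_i\}$'s

$\{\alpha_i\} = (\alpha_1, \alpha_2, \cdots, \alpha_n)$.

As in the case of chains without auxiliary restrictions, a useful function is

(45) $Z_n(x) = \sum_{\{\alpha_i\}} P_n^*(\alpha_1, \cdots, \alpha_n) \exp \{xF(\alpha_1, \cdots, \alpha_n)\} = \dfrac{1}{\pi^k} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} S_n(x, \rho_1, \cdots, \rho_k) \prod_{j=1}^{k} \left[ \dfrac{\sin (\rho_j \Delta G_j)}{\rho_j}\, e^{i\rho_j G_j}\, d\rho_j \right]$

where

$S_n(x, \rho_1, \cdots, \rho_k) = \sum_{\{\alpha_i\}} P_n(\alpha_1, \cdots, \alpha_n) \exp \Big\{ xF(\alpha_1, \cdots, \alpha_n) - i \sum_{j=1}^{k} \rho_j\, g_j(\alpha_1, \cdots, \alpha_n) \Big\}$.

When $F(\alpha_1, \cdots, \alpha_n)$ and $g_j(\alpha_1, \cdots, \alpha_n)$ are all additive or multiplicative
functions of the form (13a) and (13b), say

$F(\alpha_1, \cdots, \alpha_n) = \sum_{k=1}^{n-1} h(\alpha_k, \alpha_{k+1})$

$g_j(\alpha_1, \cdots, \alpha_n) = \sum_{k=1}^{n-1} \gamma_j(\alpha_k, \alpha_{k+1})$

and the probability chain is a simple one, $S_n(x)$ reduces to a simple form.
Suppose,

$P_n(\alpha_1, \cdots, \alpha_n) = \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1})$

then following the derivation of (24), we have

(46) $S_n(x, \rho_1, \cdots, \rho_k) = \sum_{i} \{\lambda_{i,x,\rho}\}^{n-1} (\varphi_{i,x,\rho} \cdot 1)(1 \cdot \psi_{i,x,\rho})$

where $\lambda_{i,x,\rho}$, $\varphi_{i,x,\rho}$ and $\psi_{i,x,\rho}$ are characteristic values and vectors of the matrix

$P_{x,\rho} = \begin{pmatrix} p_{x,\rho}(1, 1) & \cdots & p_{x,\rho}(1, \nu) \\ \vdots & & \vdots \\ p_{x,\rho}(\nu, 1) & \cdots & p_{x,\rho}(\nu, \nu) \end{pmatrix}$

and

$p_{x,\rho}(\alpha, \beta) = p(\alpha, \beta) \exp \Big\{ x h(\alpha, \beta) - i \sum_{j} \rho_j\, \gamma_j(\alpha, \beta) \Big\}$.

Substitution of (46) into (45) allows one to calculate $Z_n(x)$.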


5. More Complicated Chains. In a chain of N events in which the result of
each event depends on those of its n predecessors (n << N), the calculation of
$Z_N(x)$ proceeds in essentially the same manner as in the case of a simple chain.
Let the N events be divided into N/n sets of “grand events” of n simple events
each (for simplicity we assume N is divisible by n; this can easily be avoided).
Thus, if each simple event could lead to any one of $\nu$ possible results, a grand
event could lead to any one of $\nu^n$ possible results and a complicated chain becomes
a simple chain of grand events with the result of each grand event depending on
the preceding grand event. Quantitative calculations thus proceed formally in
the same manner as in a simple chain.

6. Continuous Case. In this section we generalize, by studying an example,
to the case in which each event in a simple chain may lead to any one of a continuum
of results. The example is a problem arising in statistical mechanics of
molecular chains.

Consider a linear chain of n identical molecules whose centers of mass remain
at a set of fixed regularly spaced positions, but which may rotate about their
centers of mass in a plane. Suppose that the potential energy of interaction
between neighboring pairs of molecules is a function of the angles a specified
axis of the molecules makes with the line connecting the centers of mass of the
molecules; that is, the potential energy of interaction between pairs of adjacent
molecules can be written as $V(\theta_j, \theta_{j+1})$. Assuming that forces are sufficiently
short ranged for interaction between more distant neighbors to be neglected,
Boltzmann's theorem states that the probability of the axis of the first molecule
making an angle between $\theta_1$ and $\theta_1 + d\theta_1$ with the line of centers of the chain,
the second between $\theta_2$ and $\theta_2 + d\theta_2$ and the nth between $\theta_n$ and $\theta_n + d\theta_n$ is
proportional to

$\exp \{ -\tfrac{1}{kT} [V(\theta_1, \theta_2) + V(\theta_2, \theta_3) + \cdots + V(\theta_{n-1}, \theta_n)] \}$,

where k is Boltzmann's constant and T is the absolute temperature. The
contribution of the interaction to the thermodynamic properties of the chain
can be derived from the partition function

(47) $Z_n = \int_0^{2\pi} \cdots \int_0^{2\pi} \exp \{ -\tfrac{1}{kT} [V(\theta_1, \theta_2) + \cdots + V(\theta_{n-1}, \theta_n)] \}\, d\theta_1 \cdots d\theta_n$.

For example, the internal energy is

$E = \partial \log Z_n/\partial(-1/kT)$
and the specific heat is $c = \partial E/\partial T$.

It is to be noted that $Z_n$ is exactly the integral of the iterated kernel of the
integral equation

(48) $\lambda \psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2) \exp \{ -\tfrac{1}{kT} V(\theta_1, \theta_2) \}\, d\theta_2$.

If $V(\theta_1, \theta_2)$ is symmetrical in $\theta_1$ and $\theta_2$, this linear homogeneous integral equation
has a set of orthonormal characteristic functions $\{\psi_j(\theta)\}$ such that

(49) $\int_0^{2\pi} \psi_j(\theta)\, \psi_k(\theta)\, d\theta = \delta_{jk}$.

To each of these characteristic functions there corresponds a characteristic value
$\lambda_j$. Now it is well known that the kernel of (48) can be expanded as a series in
its characteristic functions,

$\exp \{ -\tfrac{1}{kT} V(\theta_1, \theta_2) \} = \sum_j \lambda_j\, \psi_j(\theta_1)\, \psi_j(\theta_2)$.

Introducing this expression into (47) and applying the orthogonality conditions
(49) one obtains

(47a) $Z_n = \sum_j \lambda_j^{n-1} \left[ \int_0^{2\pi} \psi_j(\theta)\, d\theta \right]^2$.

Probably the most interesting example of a molecular chain of the type
described above is a chain of magnetic dipoles which are restricted to rotate only
in a plane. In that case

$V(\theta_j, \theta_{j+1}) = \dfrac{\mu^2}{r^3} [\cos (\theta_j - \theta_{j+1}) - 3 \cos \theta_j \cos \theta_{j+1}]$,

where $\mu$ is the magnetic moment of each dipole and r is the distance between a
pair of adjacent centers of mass. This potential function leads to the integral
equation

$\lambda \psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2) \exp \{ -\tfrac{\mu^2}{r^3 kT} [\cos (\theta_1 - \theta_2) - 3 \cos \theta_1 \cos \theta_2] \}\, d\theta_2$.

Since this equation is rather complicated to solve, we shall devote the rest of the
section to a potential function of less physical interest, but which leads to a less
formidable integral equation.

In studying hindered rotation of molecules, one sometimes uses potential
functions of the form:

$V(\theta_j, \theta_{j+1}) = -\beta \cos (\theta_j - \theta_{j+1})$

where $\beta$ is a constant. With this potential function (48) becomes

(50) $\lambda \psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2) \exp \{ J \cos (\theta_1 - \theta_2) \}\, d\theta_2$

where

$J = \beta/kT$.

The characteristic functions and characteristic values of (50) are easily found
with the aid of the Fourier series for exp (J cos $\theta$):

(51) $\exp (J \cos \theta) = I_0(J) + 2 \sum_{m=1}^{\infty} I_m(J) \cos m\theta$

where $I_m(J)$ is the mth Bessel function of imaginary argument:

$I_m(J) = \sum_{k=0}^{\infty} \dfrac{(J/2)^{m+2k}}{(m + k)!\, k!}$.

From (51)

$\exp [J \cos (\theta_1 - \theta_2)] = I_0(J) + 2 \sum_{m=1}^{\infty} I_m(J) (\cos m\theta_1 \cos m\theta_2 + \sin m\theta_1 \sin m\theta_2)$.

Substituting this expression into (50) we have

$\lambda \psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2) \Big\{ I_0(J) + 2 \sum_{m=1}^{\infty} I_m(J) (\cos m\theta_1 \cos m\theta_2 + \sin m\theta_1 \sin m\theta_2) \Big\}\, d\theta_2$.

Because of the orthogonality of the trigonometric functions, one can verify by
direct substitution that the characteristic functions are

$\psi_0(\theta) = 1/(2\pi)^{1/2}$

$\psi_m^{(1)}(\theta) = \pi^{-1/2} \sin m\theta$; $\psi_m^{(2)}(\theta) = \pi^{-1/2} \cos m\theta$, (m = 1, 2, ⋯)

and the corresponding characteristic values are

$\lambda_0 = 2\pi I_0(J)$

$\lambda_m^{(1)} = \lambda_m^{(2)} = 2\pi I_m(J)$, m > 0.

Introducing these characteristic functions and values into (47a) we obtain
the simple formula for the partition function:

$Z_n = 2\pi\, [2\pi I_0(J)]^{n-1}$.
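
A numerical check of the closed form $Z_n = 2\pi [2\pi I_0(J)]^{n-1}$ is sketched below. It is only an illustration: the integrals over the angles are replaced by sums over a uniform grid, an approximation chosen for this example, and $I_0(J)$ is summed from its power series. The function names, the grid size and the number of series terms are assumptions of the sketch.

```python
import math

def bessel_i(m, J, terms=40):
    """Power series for the Bessel function of imaginary argument I_m(J)."""
    return sum((J / 2.0) ** (m + 2 * k) / (math.factorial(m + k) * math.factorial(k))
               for k in range(terms))

def partition_numeric(n, J, grid=64):
    """Z_n with every angular integral replaced by a sum over a uniform grid."""
    d = 2.0 * math.pi / grid
    theta = [i * d for i in range(grid)]
    kernel = [[math.exp(J * math.cos(ti - tj)) * d for tj in theta] for ti in theta]
    v = [1.0] * grid
    for _ in range(n - 1):          # one application of the kernel per bond
        v = [sum(row[j] * v[j] for j in range(grid)) for row in kernel]
    return d * sum(v)               # integral over the remaining angle

def partition_closed(n, J):
    """Closed form 2 pi [2 pi I_0(J)]^(n-1)."""
    return 2.0 * math.pi * (2.0 * math.pi * bessel_i(0, J)) ** (n - 1)
```

Only the constant characteristic function contributes to (47a), which is why a single Bessel function appears in the closed form.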

The internal energy of the molecular chain is therefore

$E = \partial \log Z_n/\partial(-1/kT)$

$= -\beta (n - 1)\, I_1(J)/I_0(J)$

and the specific heat is:

0 = OE/dT = IHn - - 

(

ON THE FIRST TWO MOMENTS OF THE MEASURE OF A
RANDOM SET

By L. A. Santaló
Universidad Nacional del Litoral, Argentina

1. Introduction. In a recent paper [3] H. E. Robbins derived general formulas
for the moments of the measure of any random set X, and applied the formulas
to find the mean and the variance of a random sum of intervals on a line. In
subsequent papers, J. Bronowski and J. Neyman [1], using other methods, found
the variance when X is a random sum of rectangles in the plane, and H. E.
Robbins [4] found the variance when X is a random sum of n-dimensional
intervals in n-dimensional euclidean space. In the latter paper Robbins
solved also the corresponding problem for circles on the plane.

Using the methods of Robbins, our purpose in the present paper is to solve the
following similar problems:

(i) Let R denote the rectangle consisting of all points (x, y) such that $0 < x < A_1$,
$0 < y < A_2$, and let R' denote the larger rectangle for which $-\delta < x < A_1 + \delta$,
$-\delta < y < A_2 + \delta$. Let $\rho$ denote a rectangle of fixed dimensions, a × b, but
variable position in the plane. The position of $\rho$ will be determined by the
coordinates x, y of its center P and the angle $\varphi$ between the side of length a and
the x-axis. We suppose $(a^2 + b^2)^{1/2} < \min (A_1, A_2, \delta)$. Let a fixed number N of
rectangles $\rho$ be chosen independently with the probability density function for
the coordinates (x, y, $\varphi$) of each rectangle constant and equal to $1/2\pi R'$ in the
three-dimensional interval with base R' and height $2\pi$ and zero outside this
interval. In section 3 we evaluate the first two moments of the measure of X,
where X denotes the intersection of the set-theoretical sum of the N rectangles
$\rho$ with R.

(ii) Let R denote the n-dimensional interval consisting of all points $(x_1, x_2, x_3, \cdots, x_n)$
such that $0 < x_i < A_i$, (i = 1, 2, ⋯, n), and let R' denote the
larger interval for which $-\delta < x_i < A_i + \delta$. Let a fixed number N of n-dimensional
spheres with radii r (such that $2r < \min (A_i, 2\delta)$) be chosen independently,
with the probability density function for the centre of each n-sphere constant
and equal to 1/R' in R' and zero outside this interval. Denoting by X the
intersection of the set-theoretical sum of the N n-spheres with R, we evaluate
in section 4 the first two moments of the measure of X. This problem is a
generalization to n-dimensional space of the case considered by Robbins for the
plane (n = 2) in [4].

2. Preliminary formulas. Let K be an indeformable plane convex figure of
variable position in the plane. The position of K may be determined by the
coordinates (x, y) of a point P fixed within K and the angle $\varphi$ which measures
the rotation of K about P. We shall call x, y, $\varphi$ the coordinates of K. The
measure of a set of figures congruent with K is defined as being the integral of the
differential form

(2.1) $dK = dx\, dy\, d\varphi$.

It is readily shown that this measure does not depend on the particular point P
chosen to determine the position of K [6]. For instance, the measure of the
set of figures K, each of which contains in its interior a fixed point Q, has the
value $2\pi F$, where F denotes the area of K; that is,

(2.2) $\int_{Q \in K} dK = 2\pi F$.

Let $P_1$ and $P_2$ be two fixed points and let l be the distance $P_1P_2$. The measure
of the set of figures congruent with K, each of which contains both points $P_1$
and $P_2$ in its interior, will be a function of K and l, say $\mu(K, l)$. If d is the
diameter of K, that is, the maximal distance between two points of K, we have
$\mu(K, l) = 0$ for $l \ge d$.

Examples. Let K be a rectangle $\rho$ of fixed dimensions a × b, and let us
suppose $a \le b$. The diameter d of $\rho$ is $d = (a^2 + b^2)^{1/2}$. Let P(x, y) be the
centre of $\rho$ and $\varphi$ the angle which forms the side of length b with the segment
line $P_1P_2$ of length l. If we keep first $\varphi$ constant, then in order that there exist
positions of $\rho$ in which it contains the segment line $P_1P_2$ in its interior it is necessary
that

$a - l \sin \varphi \ge 0$, $b - l \cos \varphi \ge 0$

and in this case the area covered by the centres P in all these positions has the
value

$(a - l \sin \varphi)(b - l \cos \varphi)$.

Integrating over all permissible values of $\varphi$, we obtain

(2.3) $\mu(\rho, l) = 4 \int_{\mathrm{arc\,cos}[b/l]}^{\mathrm{arc\,sin}[a/l]} (a - l \sin \varphi)(b - l \cos \varphi)\, d\varphi$

where we define

$[x] = x$ if $x \le 1$
$[x] = 1$ if $x \ge 1$.

Carrying out the obvious integration in (2.3) we have

(2.4) $\mu(\rho, l) = \begin{cases} 2\pi ab - 4l(a + b) + 2l^2 & \text{for } l \le a \le b \\ 4\big(ab \arcsin (a/l) - \tfrac{1}{2}a^2 - bl + b(l^2 - a^2)^{1/2}\big) & \text{for } a \le l \le b \\ 4\big(ab \arcsin (a/l) - ab \arccos (b/l) + b(l^2 - a^2)^{1/2} + a(l^2 - b^2)^{1/2} - \tfrac{1}{2}(a^2 + b^2) - \tfrac{1}{2}l^2\big) & \text{for } a \le b \le l. \end{cases}$
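
The three branches of (2.4) can be checked against a direct quadrature of (2.3). The sketch below is illustrative only: it assumes a ≤ b and l > 0, uses a midpoint rule with a step count chosen for the example, and the function names are not taken from the paper.

```python
import math

def mu_closed(a, b, l):
    """Closed form (2.4) for the rectangle a x b, assuming a <= b and l > 0."""
    if l >= math.hypot(a, b):       # beyond the diameter the measure vanishes
        return 0.0
    if l <= a:
        return 2.0 * math.pi * a * b - 4.0 * l * (a + b) + 2.0 * l * l
    if l <= b:
        return 4.0 * (a * b * math.asin(a / l) - 0.5 * a * a - b * l
                      + b * math.sqrt(l * l - a * a))
    return 4.0 * (a * b * (math.asin(a / l) - math.acos(b / l))
                  + b * math.sqrt(l * l - a * a) + a * math.sqrt(l * l - b * b)
                  - 0.5 * (a * a + b * b) - 0.5 * l * l)

def mu_numeric(a, b, l, steps=20000):
    """Midpoint-rule evaluation of the integral (2.3)."""
    lo = math.acos(min(b / l, 1.0))
    hi = math.asin(min(a / l, 1.0))
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        phi = lo + (i + 0.5) * h
        total += (a - l * math.sin(phi)) * (b - l * math.cos(phi))
    return 4.0 * total * h
```

The factor 4 accounts for the four quadrants of the angle, each of which contributes equally.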
As another example, let R be the rectangle consisting of all points (x, y) such
that $0 < x < A_1$, $0 < y < A_2$ and let R' be the rectangle consisting of all points
(x, y) such that

$-\delta < x < A_1 + \delta$, $-\delta < y < A_2 + \delta$, $(a^2 + b^2)^{1/2} < \min (A_1, A_2, \delta)$.

Let us consider the set of rectangles $\rho$ whose centers belong to R' and do not
contain either $P_1$ or $P_2$, $P_1$ and $P_2$ being two fixed points which belong to R.
Let l be the distance $P_1P_2$. According to (2.2) and the definition of $\mu(\rho, l)$
the measure of the set of rectangles $\rho$ under consideration is

(2.5) $2\pi R' - 2 \cdot 2\pi\rho + \mu(\rho, l)$,

where $R' = (A_1 + 2\delta)(A_2 + 2\delta)$ and $\rho = ab$.

Let K be a plane convex figure of fixed position in its plane. Let us suppose
K to be translated a distance l in the direction $\theta$, and let $F(K, l, \theta)$ be the area
of the intersection of K with the translated figure. Obviously if d is the diameter
of K, $F(K, l, \theta) = 0$ for $l \ge d$. In what follows we shall consider the function

(2.6) $\Phi(K, l) = \int_0^{2\pi} F(K, l, \theta)\, d\theta$.

Example. Let K be a rectangle R of sides $A_1$, $A_2$. Let the symbol [x], as
in [1], be defined by

$[x] = x$ if $x > 0$
$[x] = 0$ if $x \le 0$.

It is then readily seen that

(2.7) $F(R, l, \theta) = [A_1 - l\,|\sin \theta|]\,[A_2 - l\,|\cos \theta|]$.

For our purpose the case in which $l < \min (A_1, A_2)$ is of interest. In this case,
carrying out the immediate integrations, we obtain

(2.8) $\Phi(R, l) = 2\pi A_1 A_2 - 4l(A_1 + A_2) + 2l^2$.

Let $S_{n,r}$ be an n-dimensional sphere of radius r. $S_{n,r}$ will denote also the
volume of this sphere, that is, as is known, (see [2, p. 109]),

(2.9) $S_{n,r} = \dfrac{(\pi r^2)^{n/2}}{\Gamma(n/2 + 1)}$.

Let us call the measure of a set of spheres the measure of the set of their
centers. That is, if the point $P(x_1, x_2, \cdots, x_n)$ is the center of $S_{n,r}$ the measure
of a set of spheres equals the integral extended over the set, of the differential
form

(2.10) $dP = dx_1\, dx_2 \cdots dx_n$.

For instance, the measure of the set of spheres $S_{n,r}$, each of which contains a
fixed point Q in its interior, has the value

(2.11) $\int_{Q \in S_{n,r}} dP = S_{n,r}$

where $S_{n,r}$ is given by (2.9).

The measure $\mu(S_{n,r}, l)$ of the set of spheres $S_{n,r}$, each of which contains
totally in its interior a segment of length l ($l \le 2r$), equals the volume of the
intersection of two spheres $S_{n,r}$ whose centers are placed at the end points of the
given segment. That is, $\mu(S_{n,r}, l)$ equals twice the volume of the spherical
segment of an n-sphere of radius r and semiangle $\alpha = \arccos (l/2r)$. We will
represent the volume of this spherical segment by $S_{n,r}(\alpha)$ and it may be calculated
in the following way: The intersection of the n-sphere with a hyperplane at a
distance x from the center is an (n − 1)-dimensional sphere of radius
$r' = (r^2 - x^2)^{1/2}$. Let $S_{n-1,r'}$ denote the volume of this (n − 1)-dimensional sphere
(given by the general formula (2.9)). The volume of the spherical segment,
whose base lies at the distance $h = r \cos \alpha$ from the center, will be

$S_{n,r}(\alpha) = \int_h^r S_{n-1,r'}\, dx$.

Putting $x = r \cos \theta$ and substituting for $S_{n-1,r'}$ the expression given in (2.9),
we obtain

$S_{n,r}(\alpha) = \dfrac{\pi^{(n-1)/2} r^n}{\Gamma((n + 1)/2)} \int_0^{\alpha} \sin^n \theta\, d\theta$.

Consequently we can write

(2.12) $\mu(S_{n,r}, l) = 2S_{n,r}(\alpha) = 2r\, S_{n-1,r} \int_0^{\alpha} \sin^n \theta\, d\theta$,

where $S_{n-1,r}$ is the volume of the (n − 1)-dimensional sphere of radius r and
$\alpha = \arccos (l/2r)$.

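In (2.12) we may substitute

(2.13) $\int_0^{\alpha} \sin^n \theta\, d\theta = \dfrac{(n - 1)(n - 3) \cdots 3 \cdot 1}{n(n - 2) \cdots 4 \cdot 2} \arccos (l/2r) - \dfrac{l}{2r} \left[ \dfrac{1}{n} \left(1 - \dfrac{l^2}{4r^2}\right)^{(n-1)/2} + \dfrac{n - 1}{n(n - 2)} \left(1 - \dfrac{l^2}{4r^2}\right)^{(n-3)/2} + \cdots + \dfrac{(n - 1)(n - 3) \cdots 3}{n(n - 2) \cdots 4 \cdot 2} \left(1 - \dfrac{l^2}{4r^2}\right)^{1/2} \right]$

for n even, and

(2.14) $\int_0^{\alpha} \sin^n \theta\, d\theta = \dfrac{(n - 1)(n - 3) \cdots 4 \cdot 2}{n(n - 2) \cdots 5 \cdot 3} - \dfrac{l}{2r} \left[ \dfrac{1}{n} \left(1 - \dfrac{l^2}{4r^2}\right)^{(n-1)/2} + \dfrac{n - 1}{n(n - 2)} \left(1 - \dfrac{l^2}{4r^2}\right)^{(n-3)/2} + \cdots + \dfrac{(n - 1)(n - 3) \cdots 4 \cdot 2}{n(n - 2) \cdots 5 \cdot 3} \right]$

for n odd.

In particular, for n = 2, 3 we have

(2.15) $\mu(S_{2,r}, l) = 4r^2 \int_0^{\alpha} \sin^2 \theta\, d\theta = 2r^2 \arccos (l/2r) - \tfrac{1}{2} l (4r^2 - l^2)^{1/2}$

(2.16) $\mu(S_{3,r}, l) = 2\pi r^3 \int_0^{\alpha} \sin^3 \theta\, d\theta = \tfrac{4}{3}\pi r^3 - \pi r^2 l + \tfrac{1}{12}\pi l^3$.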
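
The special cases (2.15) and (2.16) can be recovered from (2.12) by numerical integration. The sketch below is an illustration under stated assumptions: (2.12) is taken in the form $2r\,S_{n-1,r} \int_0^{\alpha} \sin^n \theta\, d\theta$, the integral is evaluated by a midpoint rule, and the function names and step count are choices made for this example.

```python
import math

def ball_volume(n, r):
    """Volume (2.9) of the n-dimensional sphere of radius r."""
    return math.pi ** (n / 2.0) * r ** n / math.gamma(n / 2.0 + 1.0)

def mu_sphere(n, r, l, steps=20000):
    """Measure (2.12) of the spheres of radius r containing a segment of length l."""
    alpha = math.acos(l / (2.0 * r))
    h = alpha / steps
    integral = h * sum(math.sin((i + 0.5) * h) ** n for i in range(steps))
    return 2.0 * r * ball_volume(n - 1, r) * integral
```

For n = 2 this is the familiar area of the lens common to two equal circles whose centers are a distance l apart.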


Wc! shall now ffoncMadizo llio formnla (2.8) to a-space. 

A direction in a*sim('C' may h(‘ given by the c.orre.sponding point on the surface 
of tlie n'dimcnsional sphere* of unit radius, that is, by the end point of the radius 
which is parsdlel to the given direedion. The parametric equations of the 

n 

w-.sidic're 2 ~ ' 

1 

^ cm ifii 

sin <p\ cos (p3 

(2,17) fa “ .sin v?! fitu v >2 cos (pa 


f„_i - sin v5i sin <p3 ■ sin <pn-% cos ip„^i 

fn — sin ipi sin <pa • • • sin pn-a sin p,i_i , 

where 0 < (pj < t for f < n — 1 and 0 < <pn-i < 2 tt. The element of area of 

this n-sphere has the value (sec, (2, p. 109]) 

(2,18) dcr = sin"““ ipi ,Hin"”’ tpi - sin (£>„_: d<pidifit • • • d<p„-i , 

A direction in n-dimensicmal space may then be given by the n — 1 parameters 
Vl> <Pi> ' " ) • 

(liven the w-dimonsional interval H consisting of all points (X) , a;a , Xa , ■ ■ • > x„) 
such that 0 < X( ^ (f = 1 , 2, 3, • • ■ n), and suppose that R is translated a 
distance 1(1 < min (At , A, , , • • • , A«)) in the direction (vi , ipn-i), 

the intcr.scction of the translated interval with if2 is a new interval wlioso volume 

n 

has the value XJ (A, — aJt), where x* = (fj (fi given by (2,17)). 
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Our purpose is to evaluate the integral 


(2.19) 


U(A>~Xi)d^ 

Je, 1 


extended over the surface Er of the n-dimensional sphere of radius unity. We 
shall denote by Em either the surface of the OT-dimensional sphere of radius unity 
or its area, given, as is known [2, p. 110] by 


( 2 . 20 ) 


B, 


2Tr“/» 



Because of the synunetry, the coefficients of all the products A j, • • • .1 
have the same value 



The integral extended over the whole surface E„ equals 2" times the integral 
extended over the ^portion for which > 0. Plcnce, taking into account (2.17) 
and (2.18) we get 



«j) = (' 

/■Tit 

Jo 

• • . sin"-'- 

Jo 

1, c . nU-S 

¥>1008.^18111 ¥>4 008 ¥5 

(2.21) 




■ • sin" * Vk cos ifit clipi dtpt * * ' diph 


-(- 

^ {n + k- 2)(n + 7c - 4) . 

• • {n“+k ‘~ 2k) 

jfOr k = 

1,2, 

, n — 1, Bor k 

= n we find that 



(•r/2 

/*/! 



“r. = (- 

-IT2T 

Jo 

1 sm v^i < 
Jo 

508 ¥l 

(2,22) 




sin ip„-i cos ¥h~i ^¥>1 dipt • • • dipM~i 


= (-1)" ^ t 

’ (2n - 2)(2n 4) - • • 4.2 ’ 

Hence, we have the following general formula 


m, l) • • • A A +(-!)" ,-5 

(2n - 2)(2a - 4) ... 4.2 






{n + h- 2)(n A; - 4) . . . (n + fc - 2fc) ‘ 


(2.23) 
In particular, for n = 2 this result coincides with (2.8). For n = 3 we have

(2.24) $\Phi(R, l) = 4\pi A_1A_2A_3 - l^3 - 2\pi l (A_1A_2 + A_1A_3 + A_2A_3) + \tfrac{8}{3} l^2 (A_1 + A_2 + A_3).$
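As a numerical cross-check of the cases n = 2 and n = 3, the following sketch evaluates the coefficient of each power of l in $\Phi(R, l)$. It assumes (this is a standard Dirichlet-type integral, not a formula quoted from the paper) that the integral of $|\xi_1 \cdots \xi_k|$ over the unit sphere in n dimensions equals $2\pi^{(n-k)/2}/\Gamma((n+k)/2)$. The function name `phi_coefficients` is hypothetical.

```python
import math

def phi_coefficients(n):
    """Return [c_0, ..., c_n], where c_k multiplies l**k times the sum of all
    products of n-k distinct edge lengths A_i in Phi(R, l).

    Assumption: c_k = (-1)**k * 2*pi**((n-k)/2) / Gamma((n+k)/2), i.e. the
    signed surface integral of |xi_1 ... xi_k| over the unit n-sphere.
    """
    return [(-1) ** k * 2 * math.pi ** ((n - k) / 2) / math.gamma((n + k) / 2)
            for k in range(n + 1)]

c2 = phi_coefficients(2)   # coefficients for the plane case
c3 = phi_coefficients(3)   # coefficients for ordinary space
print(c2)
print(c3)
```

Under that assumption the n = 3 list should read $4\pi$, $-2\pi$, $8/3$, $-1$, which are the coefficients appearing in the n = 3 formula above.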


3. First problem. We can now solve the first problem (i) stated in the introduction. Denoting by the same letters either sets or their measures, we consider, as in [1] and [4], the set Y of points of R that do not belong to X. We have identically:

(3.1) $X + Y = R.$

The general method of Robbins [3], taking into account (2.2), gives immediately the first moments

(3.2) $E(Y) = R\left(1 - \frac{\rho}{R'}\right)^N$, $E(X) = R\left\{1 - \left(1 - \frac{\rho}{R'}\right)^N\right\}$,

where $R = A_1A_2$, $R' = (A_1 + 2\delta)(A_2 + 2\delta)$, $\rho = ab$.

Our remaining problem is that of evaluating the second moment of X. Let $x_i, y_i, \varphi_i$ $(i = 1, 2, 3, \cdots, N)$ be the coordinates of the N rectangles $\rho$ (section 2) and let us put, as in (2.1), $d\rho_i = dx_i\, dy_i\, d\varphi_i$. Let $P(x, y)$ and $P_0(x_0, y_0)$ be two points which belong to R and let us put $dP = dx\, dy$, $dP_0 = dx_0\, dy_0$. Let us consider the following multiple integral


(3.3) $J = \int \frac{dP\, dP_0\, d\rho_1\, d\rho_2 \cdots d\rho_N}{(2\pi R')^N}$


extended over the sets of rectangles $\rho_i$ (congruent with $\rho$) such that $x_i, y_i$ belongs to $R'$, $0 \le \varphi_i \le 2\pi$, and which do not contain either $P$ or $P_0$. That is, the domain of integration of J is defined by

(3.4) $-\delta \le x_i \le A_1 + \delta$, $-\delta \le y_i \le A_2 + \delta$, $0 \le \varphi_i \le 2\pi$, $P \in R$, $P_0 \in R$, $P \notin \rho_i$, $P_0 \notin \rho_i$ $(i = 1, 2, \cdots, N)$.


In order to calculate J, we can first keep the rectangles $\rho_i$ fixed; the points $P$ and $P_0$ can then vary independently over the set of points Y. That gives

(3.5) $J = \int \frac{Y^2\, d\rho_1\, d\rho_2 \cdots d\rho_N}{(2\pi R')^N} = E(Y^2).$


We can now reverse the order of integration, an operation which is obviously justified in this case. Keeping $P$ and $P_0$ fixed, we can vary each rectangle $\rho_i$ over the set of positions in which it does not contain either $P$ or $P_0$; letting $l$ denote the distance $PP_0$, we have, according to (2.6),

(3.6) / = f (l~ " dPdPo. 

FiRiPo<R 
In order to evaluate this integral we divide it into two parts, $J = J_1 + J_2$, according as $0 \le l \le d$ or $d \le l \le D$, where $d = (a^2 + b^2)^{1/2}$ and $D = (A_1^2 + A_2^2)^{1/2}$. In the interval $0 \le l \le d$ we introduce the new variables of integration $l, \theta$ related to $x, y, x_0, y_0$ by

(3.7) $x_0 = x + l\cos\theta$, $y_0 = y + l\sin\theta$

whence

$\frac{\partial(x, y, x_0, y_0)}{\partial(x, y, l, \theta)} = l.$

In terms of the new variables we have

In this integral the point $P$ can vary over the intersection of R with the figure obtained by translating R a distance $l$ in the direction $\theta$; that is, the integration of $dP$ gives the function $F(R, l, \theta)$ defined in section 2. According to (2.6) we therefore have

(3.8) ./i - j[' (i " --- 'I’f/f. l)l dl, 

where $g(\rho, l)$ is given by (2.4) and $\Phi(R, l)$ by (2.8).

In order to evaluate $J_2$ we observe that in the interval $d \le l \le D$, $g(\rho, l) = 0$ and we have

dslcn ^ J a*. I 

Further we have

(3.9) $\int_{0 \le l \le D} dP\, dP_0 = R^2$

and with the change of variables (3.7) and the formula (2.8) we find that

(3.10) $\int_{0 \le l \le d} dP\, dP_0 = \int_0^d \Phi(R, l)\, l\, dl = \pi A_1A_2 d^2 - \tfrac{4}{3}(A_1 + A_2) d^3 + \tfrac{1}{2} d^4.$

Collecting (3.8), (3.9), (3.10) and taking into account (3.5) we have

” TrAiAi -p ■f(Ai + yls) ct — 


(3.11) 
where $\rho = ab$, $R = A_1A_2$, $R' = (A_1 + 2\delta)(A_2 + 2\delta)$, $g(\rho, l)$ is given by (2.4) and $\Phi(R, l)$ by (2.8).

For the variance of X and of Y, we have by (2.1) and (3.2)

$\sigma_X^2 = E(X^2) - E^2(X) = E(Y^2) - E^2(Y)$,

which completes the solution of our first problem stated in the introduction.

4. Second problem. In order to solve the second problem (ii) stated in the introduction we will follow the same method as in the preceding section.

Let X be the intersection of the set-theoretical sum of the N n-dimensional spheres of radius r with the n-interval R. Let us call Y the set of those points of R that do not belong to X, that is,

(4.1) $X + Y = R.$


The general method of Robbins gives immediately

(4.2) $E(Y) = R\left(1 - \frac{S_{n,r}}{R'}\right)^N$

where $R = \prod_{i=1}^{n} A_i$, $R' = \prod_{i=1}^{n} (A_i + 2\delta)$, and $S_{n,r}$ is given by (2.9).

We now proceed to calculate $E(Y^2)$. For this purpose let $Q_1(y_1^1, y_2^1, \cdots, y_n^1)$ and $Q_2(y_1^2, y_2^2, \cdots, y_n^2)$ be two points which belong to R and $P_i(x_1^i, \cdots, x_n^i)$ be the centers of the N spheres $S_{n,r}$. Let us put

(4.3) $dQ_i = dy_1^i\, dy_2^i \cdots dy_n^i$ $(i = 1, 2)$, $dP_i = dx_1^i\, dx_2^i \cdots dx_n^i$ $(i = 1, 2, \cdots, N)$.


Consider the integral

(4.4) $J = \int \frac{dQ_1\, dQ_2\, dP_1\, dP_2 \cdots dP_N}{R'^N}$

extended over the domain defined by

$Q_1 \in R$, $Q_2 \in R$, $P_i \in R'$, $P_iQ_1 > r$, $P_iQ_2 > r$ $(i = 1, 2, \cdots, N)$.

If we keep $P_1, P_2, P_3, \cdots, P_N$ fixed, each point $Q_i$ can vary independently over the set Y; consequently we have

(4.5) $J = \int_{P_i \in R'} \frac{Y^2\, dP_1\, dP_2 \cdots dP_N}{R'^N} = E(Y^2).$

On the other hand, if we keep $Q_1$ and $Q_2$ fixed, the integral of each $dP_i$ gives $R' - 2S_{n,r} + \mu(S_{n,r}, l)$, where $\mu(S_{n,r}, l)$ is given by (2.12) and $l = Q_1Q_2$. Hence we have

(4.6) $J = \int_{Q_1 \in R,\ Q_2 \in R} \left(1 - \frac{2S_{n,r} - \mu(S_{n,r}, l)}{R'}\right)^N dQ_1\, dQ_2.$

In order to calculate this integral we split it into two parts, $J = J_1 + J_2$, according as $0 \le l \le 2r$ or $2r \le l \le D$, where $D = (\sum_{i=1}^{n} A_i^2)^{1/2}$. In the interval $0 \le l \le 2r$ we introduce the new variables of integration $l, \varphi_1, \varphi_2, \cdots, \varphi_{n-1}$ related to $y_1^1, y_2^1, \cdots, y_n^1, y_1^2, y_2^2, \cdots, y_n^2$ by

(4.7) $y_i^2 = y_i^1 + l\,\xi_i$ $(i = 1, 2, \cdots, n)$,

where $\xi_i$ is given in (2.17). It is found that

$\frac{\partial(y_1^1, \cdots, y_n^1, y_1^2, \cdots, y_n^2)}{\partial(y_1^1, \cdots, y_n^1, l, \varphi_1, \cdots, \varphi_{n-1})} = l^{n-1} \sin^{n-2}\varphi_1 \sin^{n-3}\varphi_2 \cdots \sin\varphi_{n-2}.$

Hence we have

(4.8) $dQ_1\, dQ_2 = l^{n-1}\, dQ_1\, d\sigma\, dl,$

where $d\sigma$ denotes the element of area of the n-dimensional sphere of unit radius, given by (2.18). The same method used in section 3 gives


(4.9) $J_1 = \int_0^{2r} \left(1 - \frac{2S_{n,r} - \mu(S_{n,r}, l)}{R'}\right)^N \Phi(R, l)\, l^{n-1}\, dl,$

where $\Phi(R, l)$ is given by (2.23).

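In the interval $2r \le l \le D$, $\mu(S_{n,r}, l) = 0$ and we have

(4.10) $J_2 = \left(1 - \frac{2S_{n,r}}{R'}\right)^N \left\{ \int_{0 \le l \le D} dQ_1\, dQ_2 - \int_{0 \le l \le 2r} dQ_1\, dQ_2 \right\}.$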

Now we have

(4.11) $\int_{0 \le l \le D} dQ_1\, dQ_2 = R^2$

and with the change of variables (4.7) we readily find that

(4.12) $\int_{0 \le l \le 2r} dQ_1\, dQ_2 = \int_0^{2r} \Phi(R, l)\, l^{n-1}\, dl.$

Collecting (4.9), (4.10), (4.11), (4.12) and taking into account (4.5) and (2.23) we have

(4.13) $E(Y^2) = \int_0^{2r} \left(1 - \frac{2S_{n,r} - \mu(S_{n,r}, l)}{R'}\right)^N \Phi(R, l)\, l^{n-1}\, dl + \left(1 - \frac{2S_{n,r}}{R'}\right)^N \Big\{ R^2 - \frac{E_n R\,(2r)^n}{n} - (-1)^n \frac{2^n (2r)^{2n}}{2n(2n-2)\cdots 4\cdot 2} - \sum_{k=1}^{n-1} (-1)^k \frac{2^k E_{n-k}\,(2r)^{n+k}}{(n+k)(n+k-2)\cdots(n+k-2k)} \sum_{i_1 < \cdots < i_{n-k}} A_{i_1} \cdots A_{i_{n-k}} \Big\}$

where $R = \prod A_i$, $R' = \prod (A_i + 2\delta)$; $S_{n,r}$ is given by (2.9), $E_n$ by (2.20), $\mu(S_{n,r}, l)$ by (2.12) and $\Phi(R, l)$ by (2.23). In particular, for n = 2, we obtain the value given by Robbins [3, (30)], by use of (2.8), (2.15) and the equations $S_{2,r} = \pi r^2$, $E_1 = 2$. For n = 3, the case of ordinary space, it follows from (2.10), (2.24) and the equations $S_{3,r} = \tfrac{4}{3}\pi r^3$, $E_3 = 4\pi$, $E_2 = 2\pi$, that


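(4.14) $E(Y^2) = \int_0^{2r} \left(1 - \frac{\pi(16r^3 + 12r^2 l - l^3)}{12R'}\right)^N \left[4\pi R - 2\pi(A_1A_2 + A_1A_3 + A_2A_3)\,l + \tfrac{8}{3}(A_1 + A_2 + A_3)\,l^2 - l^3\right] l^2\, dl + \left(1 - \frac{8\pi r^3}{3R'}\right)^N \left\{R^2 - \tfrac{32}{3}\pi R r^3 + 8\pi(A_1A_2 + A_1A_3 + A_2A_3)\,r^4 - \tfrac{256}{15}(A_1 + A_2 + A_3)\,r^5 + \tfrac{32}{3} r^6\right\}.$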


In this case the exact evaluation is easy if one expands the binomial under the sign of the integral and integrates term by term.

From (4.1) we see that $\sigma_X^2 = E(X^2) - E^2(X) = E(Y^2) - E^2(Y)$. Thus, from (4.2) and (4.13) we obtain immediately the second moment $E(X^2)$ and the variance $\sigma_X^2$ of X.


5. Remark. In the second problem we can replace the n-intervals R and R' by concentric n-dimensional spheres. The problem may then be stated as follows:

Let $S_{n,a}$ denote a fixed n-dimensional sphere of radius a and $S_{n,a+b}$ the concentric n-dimensional sphere of radius a + b. $S_{n,a}$ and $S_{n,a+b}$ shall also denote the corresponding volumes. Let a fixed number N of n-dimensional spheres with radii r $(r \le \min(a, b))$ be chosen independently with the probability density function for the center of each $S_{n,r}$ constant and equal to $1/S_{n,a+b}$ in $S_{n,a+b}$ and zero outside this n-sphere. Let X denote the intersection of the set-theoretical sum of the N n-spheres with $S_{n,a}$; we wish to evaluate the first two moments of the measure of X.
It suffices to observe that in this case we have


(5.1) 


^HSn.a , 1) 


( a 


win** 0 1 10 


where S„-i.n is the volume of the (« 1 MlimenHonul sphere of rariism a sinrl 

a = arc cos (,l/2a). 

The same method used in section 4 gives



where $\Phi(S_{n,a}, l)$ is given by (5.1).

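In particular, for n = 2, by use of (5.1), (2.15) and the indefinite integrals

$\int \arccos(l/2a)\, l\, dl = (\tfrac{1}{2} l^2 - a^2) \arccos(l/2a) - \tfrac{1}{4} l (4a^2 - l^2)^{1/2} + \text{constant}$,

$\int l^2 (4a^2 - l^2)^{1/2}\, dl = -\tfrac{1}{4} l (4a^2 - l^2)^{3/2} + \tfrac{1}{2} a^2 l (4a^2 - l^2)^{1/2} + 2a^4 \arcsin(l/2a) + \text{constant}$

we find that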


S(F“) = 2w 



2Tr/ — 2r’ arc cos (l/2r) 4- il(4r^ 
r(a + 6)* 




(2a’ an' cos(//2a) 


- Ji(V - lV)i a + (l - (r’a* - 

— 3a’ r(a’ - + ira* + 2r(a’ — r*)^ — a* arc 




2Tr ( 2a\2r^ ~~ a’) are ciki 


wu(r/a) 


)}■ 



For n = 3, we have by (5.1) and (2.10)

- 4,r f (l - (t,ra> ~ + tV^f)f r« 

+ 4ir |^7ra“ — -^d^ra^r® +• 4m^r* — 

From (5.2) and (5.3), with the use of the relation $\sigma_X^2 = E(X^2) - E^2(X) = E(Y^2) - E^2(Y)$, we obtain immediately the second moment $E(X^2)$ and the variance $\sigma_X^2$ of X.
ON A TEST OF WHETHER ONE OF TWO RANDOM VARIABLES
IS STOCHASTICALLY LARGER THAN THE OTHER

By H. B. Mann and D. R. Whitney
Ohio State University

1. Summary. Let x and y be two random variables with continuous cumulative distribution functions f and g. A statistic U depending on the relative ranks of the x's and y's is proposed for testing the hypothesis f = g. Wilcoxon proposed an equivalent test in the Biometrics Bulletin, December, 1945, but gave only a few points of the distribution of his statistic.

Under the hypothesis f = g the probability of obtaining a given U in a sample of n x's and m y's is the solution of a certain recurrence relation involving n and m. Using this recurrence relation tables have been computed giving the probability of U for samples up to n = m = 8. At this point the distribution is almost normal.

From the recurrence relation explicit expressions for the mean, variance, and fourth moment are obtained. The 2rth moment is shown to have a certain form which enabled us to prove that the limit distribution is normal if m, n go to infinity in any arbitrary manner.

The test is shown to be consistent with respect to the class of alternatives f(x) > g(x) for every x.

2. Introduction. Let x and y be two random variables having continuous cumulative distribution functions f and g respectively. The variable x will be called stochastically smaller than y if f(a) > g(a) for every a. We wish to test the hypothesis f = g against the alternative that x is stochastically smaller than y. Such alternatives are of great importance in testing, for instance, the effect of treatments on some measurement. One may think of x as the values of certain measurements in the control group and of y as the values of the same measurement in a group which received treatment. In a particular instance the protective effect against infection by certain bacteria was investigated. Two groups of rats were used in the experiment. The first group received no treatment; the second group received the drug. Both groups were then infected with supposedly equally diluted cultures of the bacteria under investigation. Most of the rats in both groups died, but the time of survival was measured and it was desired to test whether the drug had the effect of prolonging the life of the rats. It was desired to make inferences from the effect on rats to the effect the drug would have on humans. Thus, the only relevant alternative to the hypothesis that survival times are not influenced by the drug is that the survival time of those rats which received treatment is stochastically larger than that of the control group.
3. The U test. Let the quantities $x_1, \cdots, x_n, y_1, \cdots, y_m$ be arranged in order. This arrangement is unique with probability 1 if $P(x_i = y_j) = 0$, and this follows from our assumption of continuity. Let U count the number of times a y precedes an x. If $P(U \le \bar{U}) = \alpha$ under the null hypothesis, the test will be considered significant on the significance level $\alpha$ if $U \le \bar{U}$ and the hypothesis of identical distributions of x and y will be rejected.

This test was first proposed by Wilcoxon [1]. His statistic T is the sum of the ranks of the y's in the ordered sequence of x's and y's. In general

$U = mn + \frac{m(m+1)}{2} - T$

and this gives a simple way of computing U. Wilcoxon, however, treated only the case m = n and in this case he tabulated only 3 points of the distribution of T. Since the test seems of great utility it seemed worthwhile to compute the variance, the moments and the limit distribution of U and to investigate the class of alternatives with respect to which the test is consistent.

Although this paper is written in terms of U and the probabilities of U are tabulated, the results can be easily interpreted in terms of T if so desired.
4. The distribution of U. Consider now ordered sequences of n x's and m y's. Since it is only the relation between x and y that matters we replace each x by a 0 and each y by a 1. Let U count the number of times a 1 precedes a 0. Let $\bar{p}_{nm}(U)$ be the number of sequences of n 0's and m 1's in each of which a 1 precedes a 0 U times. Then by examining a sequence with the last term omitted we arrive at the recurrence relation:

$\bar{p}_{nm}(U) = \bar{p}_{n-1,m}(U - m) + \bar{p}_{n,m-1}(U)$

where $\bar{p}_{ij}(U) = 0$ if $U < 0$ and $\bar{p}_{i0}(U)$, $\bar{p}_{0i}(U)$ are zero or one according as $U \ne 0$ or $U = 0$.

Under the null hypothesis each of the $(n+m)!/(n!\,m!)$ sequences of n 0's and m 1's is equally likely. Consequently if $p_{nm}(U)$ represents the probability of a sequence in which a 1 precedes a 0 U times then

(1) $p_{nm}(U) = \frac{n}{n+m}\, p_{n-1,m}(U - m) + \frac{m}{n+m}\, p_{n,m-1}(U).$

Using the recurrence relation (1) the probabilities $p_{nm}(U)$ have been tabulated for $m \le n \le 8$ (see Table I). For n = m = 8 the distribution of $U - \tfrac{1}{2}(nm + 1)$ differs only a negligible amount from the normal distribution. We shall, in the following, derive the mean, the variance, and the fourth moment of U, and prove that the limit distribution of U is normal if n and m both approach infinity in any arbitrary manner.
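The recurrence for the null probabilities lends itself to a direct memoized computation. The following is a minimal sketch using exact rational arithmetic; the function names `p` and `cdf` are hypothetical, and the boundary rule (probability one at U = 0 when n or m is zero) is taken from the stated initial conditions.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, m, u):
    """Null probability that U == u for n zeros and m ones."""
    if u < 0:
        return Fraction(0)
    if n == 0 or m == 0:
        return Fraction(1) if u == 0 else Fraction(0)
    # Last term a 0 (probability n/(n+m)): it is preceded by all m ones.
    # Last term a 1 (probability m/(n+m)): U is unchanged.
    return (Fraction(n, n + m) * p(n - 1, m, u - m)
            + Fraction(m, n + m) * p(n, m - 1, u))

def cdf(n, m, u):
    """Probability of obtaining a U not larger than u."""
    return sum(p(n, m, k) for k in range(u + 1))

print(float(cdf(3, 3, 0)), float(cdf(4, 4, 3)), float(cdf(6, 6, 18)))
```

For n = m = 3 the smallest value U = 0 has probability 1/20, which is the entry .050 one expects in a table of this kind.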

Itw ohvduuB that •« 

Since the probability of the ith 1 preceding the jth 0 is $\tfrac{1}{2}$, we have

(2) $E_{nm}(U) = nm/2.$
TABLE I

Probability of Obtaining a U not Larger than that Tabulated in Comparing Samples of n and m

n = 3

   U    m = 1   m = 2   m = 3
   0    .250    .100    .050
   1    .500    .200    .100
   2    .750    .400    .200
   3            .600    .350
   4                    .500
   5                    .650

n = 4

   U    m = 1   m = 2   m = 3   m = 4
   0    .200    .067    .028    .014
   1    .400    .133    .057    .029
   2    .600    .267    .114    .057
   3            .400    .200    .100
   4            .600    .314    .171
   5                    .429    .243
   6                    .571    .343
   7                            .443
   8                            .557

n = 6

   U    m = 1   m = 2   m = 3   m = 4   m = 5   m = 6
   0    .143    .036    .012    .005    .002    .001
   1    .286    .071    .024    .010    .004    .002
   2    .428    .143    .048    .019    .009    .004
   3    .571    .214    .083    .033    .015    .008
   4            .321    .131    .057    .026    .013
   5            .429    .190    .086    .041    .021
   6            .571    .274    .129    .063    .032
   7                    .357    .176    .089    .047
   8                    .452    .238    .123    .066
   9                    .548    .305    .165    .090
  10                            .381    .214    .120
  11                            .457    .268    .155
  12                            .545    .331    .197
  13                                    .396    .242
  14                                    .465    .294
  15                                    .535    .350
  16                                            .409
  17                                            .469
  18                                            .531
We now seek an expression for $E_{nm}(u^2)$ where $u = U - nm/2$. After multiplying (1) by $(U - nm/2)^2$, using

$E_{nm}(u^2) = \sum_U (U - nm/2)^2\, p_{nm}(U)$

and expanding:

(3) $E_{nm}(u^2) = \frac{n}{n+m} E_{n-1,m}(u^2) + \frac{m}{n+m} E_{n,m-1}(u^2) + \frac{nm}{4},$

where $E_{nm}(u^2)$ denotes the expectation of $(U - nm/2)^2$ in sequences with n 0's and m 1's. The initial conditions of (3) are seen by direct calculation to be

(4) $E_{0m}(u^2) = E_{n0}(u^2) = 0.$

By substitution $E_{nm}(u^2) = nm(n + m + 1)/12$ is a solution of the recurrence relation (3) and its initial conditions (4). Hence, it follows by mathematical induction that

(5) $E_{nm}(u^2) = nm(n + m + 1)/12.$
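The variance formula nm(n + m + 1)/12 can be confirmed independently of the recurrence by brute-force enumeration of all arrangements for small n and m. This is only an illustrative sketch; the function name `central_moment` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def central_moment(n, m, order):
    """Exact E[(U - nm/2)**order] under the null hypothesis, obtained by
    enumerating every arrangement of n zeros and m ones (small n, m only)."""
    total, count = Fraction(0), 0
    for ones in combinations(range(n + m), m):   # positions of the 1's
        ones_set = set(ones)
        seen_ones, u = 0, 0
        for pos in range(n + m):
            if pos in ones_set:
                seen_ones += 1
            else:
                u += seen_ones   # this 0 is preceded by seen_ones 1's
        total += (Fraction(u) - Fraction(n * m, 2)) ** order
        count += 1
    return total / count

print(central_moment(3, 2, 2), Fraction(3 * 2 * (3 + 2 + 1), 12))
```

The same function returns zero for odd orders, reflecting the symmetry of the distribution about nm/2.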

The fourth moment is similarly a solution of the recurrence relation

(6) $E_{nm}(u^4) = \frac{n}{n+m} E_{n-1,m}(u^4) + \frac{m}{n+m} E_{n,m-1}(u^4) + \frac{nm}{16}(2n^2m + 2nm^2 - n^2 - m^2 - nm)$

which is obtained from (1) by multiplication by $(U - nm/2)^4$ and expansion. The initial conditions of (6) are found by direct calculation to be

(7) $E_{0m}(u^4) = E_{n0}(u^4) = 0.$

It may be verified that

(8) $E_{nm}(u^4) = \frac{nm(n+m+1)}{240}(5n^2m + 5nm^2 - 2n^2 - 2m^2 + 3nm - 2n - 2m)$

satisfies the recurrence relation (6) and its initial conditions (7) and hence (8) follows by mathematical induction.
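A closed form for the fourth moment can be compared with the exact distribution. The sketch below assumes the closed form $nm(n+m+1)(5n^2m + 5nm^2 - 2n^2 - 2m^2 + 3nm - 2n - 2m)/240$ and builds the exact counts from the integer recurrence for the number of sequences; the names `counts`, `fourth_moment` and `formula_8` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def counts(n, m):
    """Tuple c with c[u] = number of sequences of n 0's and m 1's having U == u."""
    if n == 0 or m == 0:
        return (1,)
    out = [0] * (n * m + 1)
    for u, c in enumerate(counts(n - 1, m)):   # last term is a 0: U grows by m
        out[u + m] += c
    for u, c in enumerate(counts(n, m - 1)):   # last term is a 1: U unchanged
        out[u] += c
    return tuple(out)

def fourth_moment(n, m):
    """Exact E[(U - nm/2)**4] from the sequence counts."""
    c = counts(n, m)
    mean = Fraction(n * m, 2)
    return sum(k * (Fraction(u) - mean) ** 4 for u, k in enumerate(c)) / sum(c)

def formula_8(n, m):
    """Assumed closed form for the fourth moment."""
    poly = (5 * n * n * m + 5 * n * m * m - 2 * n * n - 2 * m * m
            + 3 * n * m - 2 * n - 2 * m)
    return Fraction(n * m * (n + m + 1), 240) * poly

print(fourth_moment(2, 2), formula_8(2, 2))
```

For n = m = 2 the values of u are -2, -1, 0, 0, 1, 2 over the six sequences, so the fourth moment is 34/6 = 17/3, which the closed form reproduces.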

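To investigate the limit distribution of u as n, m become infinite we investigate the 2rth moment. Following the same procedure as in the case of the second and fourth moments and using the symmetry of the distribution to find the odd moments zero we get the following recurrence relation,

(9) $E_{nm}(u^{2r}) = \frac{n}{n+m} \sum_{a=0}^{r} \binom{2r}{2a} \left(\frac{m}{2}\right)^{2a} E_{n-1,m}(u^{2r-2a}) + \frac{m}{n+m} \sum_{a=0}^{r} \binom{2r}{2a} \left(\frac{n}{2}\right)^{2a} E_{n,m-1}(u^{2r-2a}).$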

For r = 1, 2 it is known that $E_{nm}(u^{2r})$ is a polynomial in n and m of degree 3r and that it is divisible by $nm(n + m + 1)$. Assuming that $E_{nm}(u^{2a})$, $a < r$, is a polynomial in n and m of degree 3a divisible by $nm(n + m + 1)$ we will show that it is possible to find a polynomial of degree 3r in n and m divisible by $nm(n + m + 1)$ which satisfies the recurrence relation (9) for $E_{nm}(u^{2r})$ and also its initial conditions, namely, $E_{n0}(u^{2r}) = E_{0m}(u^{2r}) = 0$.

The last condition is trivially satisfied if $E_{nm}(u^{2r})$ is divisible by $nm(n + m + 1)$.

Our method here is to actually substitute a polynomial with undetermined coefficients into (9) and show that the coefficients can be obtained uniquely. Rearranging (9) we obtain


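(10) $E_{nm}(u^{2r}) - \frac{n}{n+m} E_{n-1,m}(u^{2r}) - \frac{m}{n+m} E_{n,m-1}(u^{2r}) = \frac{1}{n+m} \sum_{a=1}^{r} \binom{2r}{2a} \frac{1}{4^a} \left[n m^{2a} E_{n-1,m}(u^{2r-2a}) + m n^{2a} E_{n,m-1}(u^{2r-2a})\right].$

Since for $\lambda < r$ we can write $E_{nm}(u^{2\lambda}) = nm(n + m + 1) P^{2\lambda}_{nm}$ where $P^{2\lambda}_{nm}$ is a polynomial in n, m of degree $3\lambda - 3$ the above equation reduces to

(11) $E_{nm}(u^{2r}) - \frac{n}{n+m} E_{n-1,m}(u^{2r}) - \frac{m}{n+m} E_{n,m-1}(u^{2r}) = nm\, Q^{2r}_{nm}$

where $Q^{2r}_{nm}$ is a polynomial in n, m of degree $3r - 3$.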

Now let

$E_{nm}(u^{2r}) = nm(n + m + 1) \sum_{i+j \le 3r-3} a_{ij}\, n^i m^j$

where $a_{ij} = a_{ji}$ are to be determined. Substitution in (11) yields:

$\sum a_{ij}\left[(n + m + 1)\, n^i m^j - (n - 1)(n - 1)^i m^j - (m - 1)\, n^i (m - 1)^j\right] = Q^{2r}_{nm}$

and rearrangement yields:


(12) 


1,)“0 

i +/ S 8 r -» 


a.y 


4- 


i; (' t ‘) 


auti \ 


(- 1 )' “{ti^ni' 4 - »*m' 


>] 


" *%, ftint » 


Consider first the terms of degree 3r — 3. In lliis cose i + j 3r — 3 anil 
a = i will give 


3r— 8 

E a,8,_,_.[n‘m”-’-' + (i 4- 1 )(h’' 4“ n ‘ ® ')! 


(13) 3r X \ 

1-0 

3r - 3 to the eorrwfKuiding 
ones m Q^m it is possible to calculate the value of aur-®-! , (i » 0, • • ■ , 3r — 3), 
We assume now that the an are known for f 4- j > 3r — 3 — (A 1 . i| and 
we will find the value of $a_{ij}$ where $i + j = 3r - 3 - k$. Consider then the terms in (12) of degree $3r - 3 - k$. These terms will occur when

$i + j = 3r - 3$, $\alpha = i - k$; $i + j = 3r - 4$, $\alpha = i - k + 1$; $\cdots$; $i + j = 3r - 3 - k$, $\alpha = i$.

All, but the last, contain coefficients which have already been evaluated. The last one reduces to

(3r - 3) 

i«0 

Thus by equating coefficients, $a_{i,3r-3-k-i}$ for $i = 0, 1, \cdots, 3r - 3 - k$ can be evaluated in terms of the coefficients $a_{ij}$ already known and those in $Q^{2r}_{nm}$. This concludes the proof that $E_{nm}(u^{2r})$ is a polynomial in n, m of degree 3r and is divisible by $nm(n + m + 1)$.

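We now investigate the coefficients of the terms of degree 3r. For $\lambda = 1, 2$,

$E_{nm}(u^{2\lambda}) = \frac{(2\lambda - 1) \cdots 5\cdot 3\cdot 1}{12^\lambda} \left\{(nm)^\lambda (n + m + 1)^\lambda + \text{terms of degree} < 3\lambda\right\}.$

We assume this to hold for $\lambda < r$ and we will show that it holds for $\lambda = r$. Substitution reduces the right side of (10) to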


^ 2r(2r - 3) ••• 5-3-1 . 

+ mn’f — vr\m - ir'(n + m)'"’ 


D^-V-^n + m) 


r- 1 


-fi (terms of degi'ee < 3r) 


or 


■ [n(7i — + m(ni — l)’’"'n’’'*''] -fi (terms of degree < 3r — 1) 

which reduces to 


3r(2r - 1) • ■ ■ 5'3-l 
12 ' 


(nmYin + m)*^^ -|- (terms of degree < 3r — 1). 


Comparison of coefficients with (13) multiplied by nm gives 

(2r - 1) 6-3-1 


3r-.3 

1.-0 


12 ’- 


(nmY(n -f- m) 


r-l 


or 


(14) $E_{nm}(u^{2r}) = \frac{(2r - 1) \cdots 5\cdot 3\cdot 1}{12^r} (nm)^r (n + m + 1)^r + (\text{terms of degree} < 3r).$
We now wish to show that $E_{nm}(u^{2r})$ is at most of degree 2r in n or m. For r = 1, 2 this has already been established. Assuming that it is true for lower moments, the right side of (10), which reduces to $nm\, Q^{2r}_{nm}$, is at most of degree $2r - 1$ in n. We again compare coefficients in (12). First, for terms of degree $3r - 3$ we have already seen that n has degree at most $2r - 2$. For terms of degree $3r - 4$ we use $i + j = 3r - 3$, $\alpha = i - 1$ and $i + j = 3r - 4$, $\alpha = i$. The first case gives rise to no terms in n of degree greater than $2r - 2$, so when we solve for the coefficients $a_{i,3r-4-i}$ the coefficients of terms in n of degree greater than $2r - 2$ must be zero. The process repeats and we find no terms in n or m of degree greater than $2r - 2$ in the left side of (12). This gives $E_{nm}(u^{2r})$ at most the degree 2r in n or m.

Now consider the ratio

$\gamma = \frac{E_{nm}(u^{2r})}{[E_{nm}(u^2)]^r} = \frac{\frac{(2r-1) \cdots 5\cdot 3\cdot 1}{12^r} (nm)^r (n + m + 1)^r}{[nm(n + m + 1)/12]^r} + \frac{(\text{terms of degree} < 3r;\ \text{in } n \text{ or } m, \le 2r)}{[nm(n + m + 1)/12]^r}$

$= (2r - 1) \cdots 5\cdot 3\cdot 1 + \frac{12^r\, (\text{terms of degree} < 3r;\ \text{in } n \text{ or } m, \le 2r)}{(nm)^r (n + m + 1)^r}.$

Hence

(15) $\lim \gamma = (2r - 1) \cdots 5\cdot 3\cdot 1$

and by a well known theorem it follows from (15) that the limit distribution is normal.
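The approach of the standardized moments to the normal values 3 and 15 (for r = 2 and r = 3) can be observed numerically. This sketch recomputes the exact sequence counts and evaluates the moment ratio in floating point; the names `counts` and `moment_ratio` are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def counts(n, m):
    """Tuple c with c[u] = number of sequences of n 0's and m 1's having U == u."""
    if n == 0 or m == 0:
        return (1,)
    out = [0] * (n * m + 1)
    for u, c in enumerate(counts(n - 1, m)):   # last term is a 0
        out[u + m] += c
    for u, c in enumerate(counts(n, m - 1)):   # last term is a 1
        out[u] += c
    return tuple(out)

def moment_ratio(n, m, r):
    """E(u**(2r)) / E(u**2)**r for u = U - nm/2 under the null hypothesis."""
    c = counts(n, m)
    total = sum(c)
    mean = n * m / 2
    second = sum(k * (u - mean) ** 2 for u, k in enumerate(c)) / total
    high = sum(k * (u - mean) ** (2 * r) for u, k in enumerate(c)) / total
    return high / second ** r

for size in (5, 10, 20):
    print(size, moment_ratio(size, size, 2), moment_ratio(size, size, 3))
```

The ratios lie below the normal values for these sample sizes and increase toward them as n = m grows.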


5. Consistency of the U test. If f and g are the cumulative distribution functions of the x's and y's then our null hypothesis is f = g. The alternatives admitted are f(a) > g(a) for every a. Let $E_A$ denote the expectation under the alternative.

Defining

$x_{ij} = 0$ if $x_i < y_j$, $x_{ij} = 1$ if $x_i > y_j$,

we have

$E_A(x_{ij}) = P(x_i > y_j) = \int_{-\infty}^{\infty} g\, df < \tfrac{1}{2}$,

$E_A(x_{ij} x_{ik}) = P(x_i > y_j;\ x_i > y_k) = \int_{-\infty}^{\infty} g^2\, df < \tfrac{1}{3}$,

$E_A(x_{ik} x_{jk}) = P(x_i > y_k,\ x_j > y_k) = \int_{-\infty}^{\infty} (1 - f)^2\, dg < \tfrac{1}{3}$.
we have 
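We can now write

$E_A(x_{ij}) = \tfrac{1}{2} - \lambda$, $E_A(x_{ij} x_{ik}) = \tfrac{1}{3} - \epsilon_1$, $E_A(x_{ik} x_{jk}) = \tfrac{1}{3} - \epsilon_2$

where $\lambda, \epsilon_1, \epsilon_2$ are positive numbers.

We have then

$\sigma_A^2(x_{ij}) = \tfrac{1}{4} - \lambda^2$, $\sigma_A(x_{ij} x_{ik}) = \tfrac{1}{12} - \epsilon_1 + \lambda - \lambda^2$,

$\sigma_A(x_{ij} x_{kl}) = 0$ for $i \ne k$, $j \ne l$, $\sigma_A(x_{ik} x_{jk}) = \tfrac{1}{12} - \epsilon_2 + \lambda - \lambda^2$.

Now

(16) $E_A(U) = \sum_{i,j} E_A(x_{ij}) = nm/2 - \lambda nm$

and

(17) $\sigma_A^2(U) = \sum \sigma_A^2(x_{ij}) + \sum \sigma_A(x_{ij} x_{ik}) + \sum \sigma_A(x_{ik} x_{jk}) + \sum \sigma_A(x_{ij} x_{kl})$

or

$\sigma_A^2(U) = nm(n + m + 1)/12 + nm\left[-\lambda^2(n + m - 1) + (\lambda - \epsilon_1)(m - 1) + (\lambda - \epsilon_2)(n - 1)\right].$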
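Let the critical region under the null hypothesis consist of those U's satisfying $nm/2 - U > t_n\sigma$ where $\lim_{n \to \infty} t_n = t$. Then

$P(nm/2 - U > t_n\sigma \mid A) = P(E_A(U) - U > k\sigma_A \mid A)$ where $k = \frac{t_n\sigma - \lambda nm}{\sigma_A}$

and by Tchebycheff's inequality, since for large values of n, m, $k < 0$, this probability is at least $1 - 1/k^2$, which by (5) and (17) gives

$P(nm/2 - U > t_n\sigma \mid A) \ge 1 - \frac{\frac{nm(n+m+1)}{12} + nm\left[-\lambda^2(n + m - 1) + (\lambda - \epsilon_1)(m - 1) + (\lambda - \epsilon_2)(n - 1)\right]}{\left(t_n \sqrt{nm(n + m + 1)/12} - \lambda nm\right)^2}$

$= 1 - \frac{1 + \frac{12}{n+m+1}\left[-\lambda^2(n + m - 1) + (\lambda - \epsilon_1)(m - 1) + (\lambda - \epsilon_2)(n - 1)\right]}{\left(t_n - \lambda \sqrt{\frac{12nm}{n+m+1}}\right)^2}.$

We obtain then that

$\lim P(nm/2 - U > t_n\sigma \mid A) = 1$

which is the requirement for consistency.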
6. Comparison with other tests. Another test which might seem appropriate for the comparison of a control group with a group receiving treatment is the test introduced by Wald and Wolfowitz [2]. The test by Wald and Wolfowitz is consistent with respect to every alternative $f \ne g$. However in the case considered we are only interested in the alternative hypothesis that measurements in the group receiving treatment are stochastically larger than in the control group. Intuitively, it seems that the test proposed here is more efficient for detecting the particular alternative considered than the test proposed by Wald and Wolfowitz. This intuitive feeling was borne out by the results of the test in the particular experiment described in the introduction. All in all, 62 experiments were conducted using various bacteria in different solutions and various amounts of the protective drug. The U test gave 14 significant results on the 5% level and 4 on the 1% level. The test of Wald and Wolfowitz gave 7 significant results on the 5% level and 2 on the 1% level. A final decision between the two tests can, of course, only be arrived at on the basis of their power functions, which present formidable difficulties.

In comparing the two statistics it was noted that a slight dislocation of a value may cause a significant change in the number of runs more easily than it can cause a significant change in the statistic proposed here. For instance, in the
sequence xi.rj.ts.TiTd.roiyii/sMiMs ''"tli statisfies would give a probnliilify leas than 
,05 If however, the sequence is slightly alterisl In , 

P (number of runs < 4) > ,()5wliile/'(f' < U = .(X)2, 

After completion of the present paper it came to the authors' attention that the U test had already been proposed by K. K. Mathen [3]. However Mathen's distribution of U is incorrect and its derivation erroneous, since it assumes independence of the random variables $x_{ij}$ as defined in section 5 of the present paper, while obviously $x_{ij}$ and $x_{ik}$ are not independent.
ON THE CONVERGENCE OF SEQUENCES OF MOMENT
GENERATING FUNCTIONS

By W. Kozakiewicz

University of Saskatchewan

1. Summary. The purpose of this paper is to give a few theorems concerning the reciprocal relation between the convergence of a sequence of distribution functions and the convergence of the corresponding sequence of their moment generating functions.

The paper consists of two parts. In the first part the univariate case is discussed. The content of this part is closely related to that of a recent paper by J. H. Curtiss [1, p. 430-433], but the results are of a somewhat more general nature, and the methods of proofs are different and do not make use of the theory of a complex variable. The second part deals with the multivariate case which, as far as the author knows, has not been treated before with proofs in as complete and rigorous a way.

In both the univariate and multivariate cases the proofs are based on the well known Helly selection principle [2, p. 26] for bounded sequences of monotonic functions.

2. The univariate case. Let X be a random variable and F(x) its distribution function. That is, for any real x, $F(x) = P\{X \le x\}$, where $P\{X \le x\}$ denotes the probability of the event $X \le x$. The function

$\varphi(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx}\, dF(x),$

in which the integral is taken in the Stieltjes-Riemann sense and is assumed to converge in some neighborhood of the origin, is called the moment generating function of X (or of F(x)).

Henceforth we use the abbreviations d.f. and m.g.f. for the terms distribution function and moment generating function respectively. The variable t will be always real.

Theorem 1. Let $\{F_n(x)\}$ be a sequence of d.f.'s. Let M(x) for any fixed non-negative x be the least upper bound of the sequence $\{F_n(-x) + 1 - F_n(x)\}$. If the sequence $\{F_n(x)\}$ converges on an everywhere dense set of points on the x-axis, and if there exists a positive number $\alpha$ such that for any fixed t in the interval $|t| < \alpha$

(1) $\lim_{x \to +\infty} e^{|t|x} M(x) = 0,$

then:

(a) there exists a d.f. F(x) such that $\lim_{n \to \infty} F_n(x) = F(x)$ at each point of continuity of F(x);

(b) the m.g.f.'s of F(x) and $F_n(x)$, say $\varphi(t)$ and $\varphi_n(t)$, exist for $|t| < \alpha$;

(c) $\lim_{n \to \infty} \varphi_n(t) = \varphi(t)$ for $|t| < \alpha$ and uniformly in each interval $|t| \le \beta < \alpha$.
To prove (a), it may be noticed that there exists a function, F(x), non-decreasing and continuous on the right, such that $\lim_{n \to \infty} F_n(x) = F(x)$ at each point of continuity of F(x). But F(x) must be a distribution function. Indeed, we have for x > 0

(2) $F(-x) + 1 - F(x) \le M(x).$

Now from (1), putting t = 0, we find that M(x) and consequently $F(-x) + 1 - F(x)$ approach zero as $x \to +\infty$. This proves that $F(-\infty) = 0$ and $F(+\infty) = 1$.

To prove (b), we notice first that the integral

$\varphi_n(t) = \int_{-\infty}^{\infty} e^{tx}\, dF_n(x)$ $(n = 1, 2, \cdots)$,

is convergent for $|t| < \alpha$. This follows immediately from (1) by applying the method of integration by parts to the integrals

$\int_0^N e^{tx}\, dF_n(x)$ and $\int_{-N}^0 e^{tx}\, dF_n(x)$,

which for any t in the interval $|t| < \alpha$ will be seen to be bounded for all values of N. By the same argument, the relation $\lim_{x \to +\infty} e^{|t|x}[F(-x) + 1 - F(x)] = 0$, $|t| < \alpha$, which can be easily deduced from (1) together with (2), implies that the integral representing $\varphi(t)$ is convergent for $|t| < \alpha$.

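Let now $\beta$ be a positive number less than $\alpha$ and let $\gamma$ be such that $\beta < \gamma < \alpha$. Let $M_\gamma$ be the least upper bound of $M(x)e^{\gamma x}$ for $x \ge 0$. Using the method of integration by parts and applying (1) we have for $|t| \le \beta$

(3) $\int_N^{\infty} e^{tx}\, dF_n(x) = [1 - F_n(N)]\, e^{tN} + t \int_N^{\infty} e^{tx}[1 - F_n(x)]\, dx \le M_\gamma e^{-(\gamma - \beta)N} + \frac{M_\gamma \beta\, e^{-(\gamma - \beta)N}}{\gamma - \beta}.$

We could prove easily that the same inequality is true for the integrals

$\int_{-\infty}^{-N} e^{tx}\, dF_n(x)$, $\int_N^{\infty} e^{tx}\, dF(x)$, $\int_{-\infty}^{-N} e^{tx}\, dF(x)$.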


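Now let $\epsilon$ be any positive number. Because of (3), we have

(4) $\int_{|x| > N_0} e^{tx}\, dF_n(x) < \epsilon$, $\int_{|x| > N_0} e^{tx}\, dF(x) < \epsilon$,

for a sufficiently great number $N_0$, and uniformly with respect to n and t, when $|t| \le \beta$. Clearly, $N_0$ can be so chosen that F(x) is continuous for $x = \pm N_0$. Then

(5) $\lim_{n \to \infty} \int_{-N_0}^{N_0} e^{tx}\, dF_n(x) = \int_{-N_0}^{N_0} e^{tx}\, dF(x),$

uniformly for $|t| \le \beta$.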
The relations (4) and (5) prove that $\varphi_n(t) \to \varphi(t)$ as $n \to \infty$, uniformly for $|t| \le \beta$. But $\beta$ can be chosen as near to $\alpha$ as we please; thus (c) is proved.

Theorem 2. Let $\{F_n(x)\}$ be a sequence of d.f.'s and $\{\varphi_n(t)\}$ the corresponding sequence of m.g.f.'s. If $\varphi_n(t)$ exists for $|t| < \alpha$, and if there exists a finite valued function $\varphi(t)$ defined for $|t| < \alpha$, such that $\lim_{n \to \infty} \varphi_n(t) = \varphi(t)$ for $|t| < \alpha$, then

(a) $\lim_{x \to +\infty} M(x)e^{|t|x} = 0$ for $|t| < \alpha$;

(b) there exists a d.f. F(x) such that $\lim_{n \to \infty} F_n(x) = F(x)$ at each point of continuity of F(x);

(c) the m.g.f. of F(x) exists for $|t| < \alpha$ and is identically equal to $\varphi(t)$ in this interval;

(d) $\lim_{n \to \infty} \varphi_n(t) = \varphi(t)$ uniformly in each interval $|t| \le \beta < \alpha$.

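To prove (a), let t be a number in the interval $|t| < \alpha$, and let $\beta$ be chosen so that $|t| < \beta < \alpha$. Then, for x > 0, we have

$F_n(-x) + 1 - F_n(x) = \int_{-\infty}^{-x} dF_n(u) + \int_x^{\infty} dF_n(u)$

$\le e^{-\beta x}\left[\int_{-\infty}^{-x} e^{-\beta u}\, dF_n(u) + \int_x^{\infty} e^{\beta u}\, dF_n(u)\right]$

$\le e^{-\beta x}[\varphi_n(-\beta) + \varphi_n(\beta)].$

Consequently

$M(x)e^{|t|x} \le e^{(|t| - \beta)x}\ \operatorname{l.u.b.}_n\, [\varphi_n(-\beta) + \varphi_n(\beta)],$

and since the sequences $\{\varphi_n(-\beta)\}$ and $\{\varphi_n(\beta)\}$ are convergent, and therefore bounded, it follows that $M(x)e^{|t|x}$ approaches zero as $x \to +\infty$.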

To prove (b) we may notice that by the Helly selection principle we can choose a subsequence $\{F_{n_k}(x)\}$ which is convergent to some non-decreasing function F(x), at each point of continuity of F(x). Now Theorem 1 together with (a) implies that F(x) is a d.f. and that the limit of the subsequence $\{\varphi_{n_k}(t)\}$, namely $\varphi(t)$, must be identical, for $|t| < \alpha$, with the m.g.f. of F(x). By the uniqueness property of a m.g.f. we know that F(x) is uniquely determined by $\varphi(t)$, and therefore it follows that every convergent subsequence of $\{F_n(x)\}$ approaches the same limit F(x) at each point of continuity of F(x). This is, however, equivalent to the statement that the sequence $\{F_n(x)\}$ itself converges to F(x) at each point of continuity of F(x). Thus (b) is proved. We see at once that (c) and (d) follow immediately from Theorem 1.

Theorem 2 is of course similar to Theorem 3 in the paper of Curtiss [1, p. 432]. The proof of (a), however, is not contained in his paper. From Theorems 1 and 2 there follows immediately

Theorem 3. Let $\{F_n(x)\}$ be a sequence of d.f.'s, and let $\{\varphi_n(t)\}$ be the corresponding sequence of m.g.f.'s, which are all assumed to exist for $|t| < \alpha$. The necessary and sufficient conditions for the convergence of $\{\varphi_n(t)\}$ in the interval $|t| < \alpha$ are:

(a) $\lim_{x \to +\infty} M(x)e^{|t|x} = 0$, $|t| < \alpha$;

(b) the sequence $\{F_n(x)\}$ converges to a d.f. F(x) at each point of continuity of F(x).

Further, the m.g.f. of F(x) exists for $|t| < \alpha$ and is equal in this interval to the limit of the sequence $\{\varphi_n(t)\}$.

In his paper Curtiss gives an example of a sequence $\{F_n(x)\}$ of d.f.'s which
converges to a d.f. $F(x)$, while the corresponding sequence $\{\varphi_n(t)\}$ of m.g.f.'s does
not converge to the m.g.f. $\varphi(t)$ of the d.f. $F(x)$, though both $\varphi_n(t)$, $(n = 1, 2, \cdots)$,
and $\varphi(t)$ exist for all $t$. It may be easily proved by the direct method that in the
case considered the condition (a) of Theorem 3 is not satisfied.

It is perhaps worth while to notice that the condition (a) of Theorem 3 may
be expressed also as follows:

$$\limsup_{x\to+\infty}\ x^{-1}\log M(x) \le -\alpha.$$

3. The multivariate case. For the sake of simplicity we shall consider here
the bivariate case only. The results obtained in this chapter can be, however,
easily extended to the case when d.f.'s and m.g.f.'s are defined in the Euclidean
space of any finite number of dimensions.

Let $(X_1, X_2)$ be a random vector variable in the two-dimensional Euclidean
space, and let $F(x_1, x_2)$ be its d.f. That is, for any real numbers $x_1$ and $x_2$,

$$F(x_1, x_2) = P(X_1 \le x_1,\ X_2 \le x_2).$$

Let

$$F_1(x_1) = P(X_1 \le x_1) = F(x_1, +\infty),$$

$$F_2(x_2) = P(X_2 \le x_2) = F(+\infty, x_2);$$

then $F_1(x_1)$ and $F_2(x_2)$ are called the marginal d.f.'s of $X_1$ and $X_2$ respectively.
The m.g.f.'s of the d.f.'s $F(x_1, x_2)$, $F_1(x_1)$ and $F_2(x_2)$ are defined by the equations:

$$\varphi(t_1, t_2) = E(e^{t_1X_1 + t_2X_2}) = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} e^{t_1x_1 + t_2x_2}\,dF(x_1, x_2),$$

$$\varphi_i(t_i) = E(e^{t_iX_i}) = \int_{-\infty}^{+\infty} e^{t_ix_i}\,dF_i(x_i), \qquad (i = 1, 2),$$

in which the integrals are assumed to converge in some neighborhood of the
origin. It is easy to see that $\varphi_1(t_1) = \varphi(t_1, 0)$ and $\varphi_2(t_2) = \varphi(0, t_2)$.

Theorem 4. Let $\varphi(t_1, t_2)$ and $\varphi^*(t_1, t_2)$ be the m.g.f.'s of d.f.'s $F(x_1, x_2)$ and
$F^*(x_1, x_2)$ respectively. If $\varphi(t_1, t_2)$ and $\varphi^*(t_1, t_2)$ exist and are equal in some
neighborhood of the origin $|t_i| < \alpha_i$, $(i = 1, 2)$, then $F(x_1, x_2) = F^*(x_1, x_2)$
identically.

To prove this theorem, let us introduce two random vector variables $(X_1, X_2)$
and $(X_1^*, X_2^*)$ of which the d.f.'s are respectively $F$ and $F^*$. Consider now two
random variables

$$Z = X_1t_1 + X_2t_2, \qquad Z^* = X_1^*t_1 + X_2^*t_2,$$

where $t_1$ and $t_2$ denote two real numbers not both zero. If $\varphi(t)$ and $\varphi^*(t)$ are
respectively the m.g.f.'s of $Z$ and $Z^*$, we have

$$\varphi(t) = \varphi(tt_1, tt_2), \qquad \varphi^*(t) = \varphi^*(tt_1, tt_2).$$

Consequently $\varphi(t) = \varphi^*(t)$ provided that $|tt_i| < \alpha_i$, $(i = 1, 2)$. It follows from
the uniqueness property of the m.g.f. in the univariate case that the d.f.'s of
$Z$ and $Z^*$ must be identical. Now, according to a theorem due to Cramér
[3, p. 105], if the d.f.'s of $Z$ and $Z^*$ coincide for all pairs of values $(t_1, t_2)$ such that
$|t_1| + |t_2| \ne 0$, the d.f.'s $F$ and $F^*$ must be identical. It may be worth while to
reproduce here Cramér's proof. Let $\psi(t_1, t_2) = E(e^{i(X_1t_1 + X_2t_2)})$ and $\psi^*(t_1, t_2) =
E(e^{i(X_1^*t_1 + X_2^*t_2)})$ be the characteristic functions of $F$ and $F^*$ respectively.
Then $\psi(tt_1, tt_2)$ and $\psi^*(tt_1, tt_2)$ are the characteristic functions of $Z$ and $Z^*$
respectively. Since $Z$ and $Z^*$ have the same d.f.'s, it follows that $\psi(tt_1, tt_2) =
\psi^*(tt_1, tt_2)$ for all values of $t$. Putting $t = 1$, we find that $\psi(t_1, t_2) = \psi^*(t_1, t_2)$
if $|t_1| + |t_2| \ne 0$. For $t_1 = t_2 = 0$, $\psi(0, 0) = \psi^*(0, 0) = 1$. Therefore $\psi(t_1, t_2) =
\psi^*(t_1, t_2)$ identically, and since the characteristic function uniquely determines
the d.f., it follows that the d.f.'s $F$ and $F^*$ are identical.

Theorem 5. Let $\{F_n(x_1, x_2)\}$ be a sequence of d.f.'s. Let $F_{1n}(x_1)$ and $F_{2n}(x_2)$
be respectively the marginal d.f.'s determined by $F_n(x_1, x_2)$. Let

$$M_i(x_i) = \operatorname*{l.u.b.}_{n}\,[F_{in}(-x_i) + 1 - F_{in}(x_i)],$$

where $x_i > 0$, $(i = 1, 2)$. If there exist positive numbers $\alpha_1$ and $\alpha_2$ such that for
$|t_i| < \alpha_i$

$$\lim_{x_i\to+\infty} M_i(x_i)\,e^{|t_i|x_i} = 0, \qquad (i = 1, 2), \tag{6}$$

and if $\{F_n(x_1, x_2)\}$ converges on an everywhere dense set on the $(x_1, x_2)$ plane,
then:

(a) there exists a d.f. $F(x_1, x_2)$ such that $\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2)$ at each point
of continuity of $F(x_1, x_2)$,

(b) there exist two positive numbers $\delta_1$ and $\delta_2$, $\delta_i \le \alpha_i$, such that the m.g.f.'s of
$F(x_1, x_2)$ and $F_n(x_1, x_2)$, say $\varphi(t_1, t_2)$ and $\varphi_n(t_1, t_2)$, exist for $|t_i| < \delta_i$, $(i = 1, 2)$,

(c) $\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2)$ for $|t_i| < \delta_i$, and uniformly in each two-dimensional
interval $|t_i| \le \beta_i < \delta_i$, $(i = 1, 2)$.

To prove (a), we notice that there obviously exists a function $F(x_1, x_2)$, continuous
on the right with respect to each variable, satisfying the relation

$$\Delta^2 F(x_1, x_2) = F(x_1', x_2') + F(x_1, x_2) - F(x_1', x_2) - F(x_1, x_2') \ge 0$$

for $x_1 < x_1'$, $x_2 < x_2'$, and such that

$$\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2) \tag{7}$$

at each point of continuity of $F(x_1, x_2)$. We shall prove that $F(x_1, x_2)$ is a d.f.
In fact, it is easy to see that we have for $x_i > 0$, $(i = 1, 2)$,

$$F(-x_1, -x_2) \le F(-x_1, x_2) \le M_1(x_1-), \qquad F(x_1, -x_2) \le M_2(x_2-),$$
$$1 - F(x_1, x_2) \le M_1(x_1) + M_2(x_2). \tag{8}$$

Now, according to (6), $\lim_{x_i\to+\infty} M_i(x_i-) = \lim_{x_i\to+\infty} M_i(x_i) = 0$, $(i = 1, 2)$, therefore it
follows from (8) that $F(-\infty, -\infty) = F(-\infty, x_2) = F(x_1, -\infty) = 0$ and
$F(+\infty, +\infty) = 1$, which proves that $F(x_1, x_2)$ is a d.f.

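To prove (b), let $\varphi_{in}(t_i)$ be the m.g.f. of the d.f. $F_{in}(x_i)$, $(i = 1, 2)$. Let
$F_1(x_1)$ and $F_2(x_2)$ be the marginal d.f.'s determined by $F(x_1, x_2)$ and let $\varphi_i(t_i)$ be
the m.g.f. of $F_i(x_i)$, $(i = 1, 2)$.

Now let $N' > N > 0$ and

$$R_n(N, N', t_1, t_2) = \int_{-N'}^{N'}\!\!\int_{-N'}^{N'} e^{t_1x_1 + t_2x_2}\,dF_n(x_1, x_2) - \int_{-N}^{N}\!\!\int_{-N}^{N} e^{t_1x_1 + t_2x_2}\,dF_n(x_1, x_2)$$

$$= \int_{N}^{N'}\!\!\int_{-N}^{N'} + \int_{-N'}^{N}\!\!\int_{N}^{N'} + \int_{-N'}^{-N}\!\!\int_{-N'}^{N} + \int_{-N}^{N'}\!\!\int_{-N'}^{-N} = I_1 + I_2 + I_3 + I_4,$$

each term having the integrand $e^{t_1x_1 + t_2x_2}\,dF_n(x_1, x_2)$, with the outer limits referring to $x_1$.
Applying the Schwarz inequality to $I_1$, we find

$$I_1 \le \left[\int_{N}^{N'}\!\!\int_{-N}^{N'} e^{2t_1x_1}\,dF_n(x_1, x_2)\cdot\int_{N}^{N'}\!\!\int_{-N}^{N'} e^{2t_2x_2}\,dF_n(x_1, x_2)\right]^{1/2}. \tag{9}$$

But

$$\int_{N}^{N'}\!\!\int_{-N}^{N'} e^{2t_1x_1}\,dF_n(x_1, x_2) \le \int_{N}^{N'} e^{2t_1x_1}\,dF_{1n}(x_1) \tag{10}$$

and similarly

$$\int_{N}^{N'}\!\!\int_{-N}^{N'} e^{2t_2x_2}\,dF_n(x_1, x_2) \le \int_{-N}^{N'} e^{2t_2x_2}\,dF_{2n}(x_2). \tag{11}$$

Let $\epsilon$ be any positive number and $\gamma_i$ a positive number less than $\alpha_i$, $(i = 1, 2)$.
It follows from the proof of Theorem 1, taking into account (6), that the
integrals representing $\varphi_{in}(t_i)$ and $\varphi_i(t_i)$, $(i = 1, 2)$, exist and are uniformly convergent
with respect to $n$ and $t_i$, when $|t_i| \le \gamma_i$, $(i = 1, 2)$. Consequently
we have

$$\int_{|x_i| \ge N} e^{t_ix_i}\,dF_{in}(x_i) < \epsilon, \qquad \int_{|x_i| \ge N} e^{t_ix_i}\,dF_i(x_i) < \epsilon, \qquad (i = 1, 2), \tag{12}$$

uniformly with respect to $n$ and $t_i$ when $|t_i| \le \gamma_i$, $(i = 1, 2)$, provided that $N$ is
sufficiently large, say $N > N_0$. Let us take $\beta_i = \gamma_i/2$, $(i = 1, 2)$. The integrals
representing $\varphi_{1n}(t_1)$ and $\varphi_{2n}(t_2)$ are obviously uniformly bounded for all
$n$ and when $|t_i| \le \gamma_i$, $(i = 1, 2)$; they are all less than some constant $C$. Consequently,
taking into account (9), (10), (11), and (12), we find

$$I_1 < \sqrt{C\epsilon},$$

uniformly with respect to $n$ and $t_i$ when $|t_i| \le \beta_i$, $(i = 1, 2)$, provided that
$N' > N > N_0$. Since the same inequality is true for $I_2$, $I_3$ and $I_4$, we have

$$R_n(N, N', t_1, t_2) < 4\sqrt{C\epsilon}, \tag{13}$$

uniformly with respect to $n$ and $t_i$, when $|t_i| \le \beta_i$, $(i = 1, 2)$, provided $N' >
N > N_0$. Hence the integral representing $\varphi_n(t_1, t_2)$ is uniformly convergent for
$|t_i| \le \beta_i$, and consequently convergent for $|t_i| < \alpha_i/2$, $(i = 1, 2)$, since $\beta_i$
can be chosen as near to $\alpha_i/2$ as we please.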

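Similarly, using (12), we could find

$$R(N, N', t_1, t_2) < 4\sqrt{C\epsilon}, \qquad |t_i| \le \beta_i,\quad N' > N > N_0, \tag{14}$$

where

$$R(N, N', t_1, t_2) = \left[\int_{-N'}^{N'}\!\!\int_{-N'}^{N'} - \int_{-N}^{N}\!\!\int_{-N}^{N}\right] e^{t_1x_1 + t_2x_2}\,dF(x_1, x_2).$$

This proves, in turn, that the integral representing $\varphi(t_1, t_2)$ is uniformly convergent
for $|t_i| \le \beta_i$ and convergent for $|t_i| < \alpha_i/2$, $(i = 1, 2)$. Thus (b) is
proved with $\delta_i = \alpha_i/2$, $(i = 1, 2)$.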

To prove (c), let $N' \to +\infty$ and $N = N_0$ in (13) and (14). We obtain

$$R_n(N_0, +\infty, t_1, t_2) \le 4\sqrt{C\epsilon}, \qquad R(N_0, +\infty, t_1, t_2) \le 4\sqrt{C\epsilon} \tag{15}$$

uniformly with respect to $n$ and $t_i$ when $|t_i| \le \beta_i$.

Clearly, $N_0$ can be chosen so that $F_1(x_1)$ and $F_2(x_2)$ are continuous for $x_1 =
x_2 = \pm N_0$. Then

$$\lim_{n\to\infty} \int_{-N_0}^{N_0}\!\!\int_{-N_0}^{N_0} e^{t_1x_1 + t_2x_2}\,dF_n(x_1, x_2) = \int_{-N_0}^{N_0}\!\!\int_{-N_0}^{N_0} e^{t_1x_1 + t_2x_2}\,dF(x_1, x_2), \tag{16}$$

uniformly for $|t_i| \le \beta_i$, $(i = 1, 2)$.

The relations (15) and (16) prove that

$$\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2),$$

uniformly for $|t_i| \le \beta_i$, $(i = 1, 2)$. The ordinary convergence obviously holds
for $|t_i| < \alpha_i/2$, $(i = 1, 2)$.

It follows from the above proof, which refers to the bivariate case, that we
may take $\delta_i = \alpha_i/2$, $(i = 1, 2)$, in (b) and (c).

The existence of the corresponding numbers $\delta_i$, $\delta_i \le \alpha_i$, $(i = 1, 2, \cdots, k)$,
in the $k$-variate case can be easily established by the repeated application of the
Schwarz inequality.
Theorem 6. Let $\varphi_n(t_1, t_2)$, $F_n(x_1, x_2)$, $F_{in}(x_i)$ and $M_i(x_i)$, $(i = 1, 2)$,
have the same meaning as in Theorem 5. If $\varphi_n(t_1, t_2)$ exist for $|t_i| < \alpha_i$,
$(i = 1, 2)$, and if there exists a finite valued function $\varphi(t_1, t_2)$ defined for $|t_i| < \alpha_i$,
such that

$$\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2), \qquad |t_i| < \alpha_i,$$

then

(a) $\lim_{x_i\to+\infty} M_i(x_i)\,e^{|t_i|x_i} = 0$ for $|t_i| < \alpha_i$, $(i = 1, 2)$,

(b) there exists a d.f. $F(x_1, x_2)$ such that $\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2)$ at each point of
continuity of $F(x_1, x_2)$,

(c) the m.g.f. of $F(x_1, x_2)$ exists for $|t_i| < \alpha_i$ and is identically equal to $\varphi(t_1, t_2)$ for
$|t_i| < \alpha_i$, $(i = 1, 2)$,

(d) $\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2)$ uniformly for $|t_i| \le \beta_i < \alpha_i$, $(i = 1, 2)$.

To prove (a), it is sufficient to notice that $\varphi_{1n}(t_1) = \varphi_n(t_1, 0)$ and $\varphi_{2n}(t_2) =
\varphi_n(0, t_2)$. Consequently we have

$$\lim_{n\to\infty} \varphi_{1n}(t_1) = \varphi(t_1, 0), \qquad \lim_{n\to\infty} \varphi_{2n}(t_2) = \varphi(0, t_2), \qquad |t_i| < \alpha_i,\ (i = 1, 2).$$

Therefore (a) follows immediately from Theorem 2.

To prove (b), we may notice that according to the Helly principle of selection
applied to the sequence $\{F_n(x_1, x_2)\}$, there exists a subsequence $\{F_{n_k}(x_1, x_2)\}$,
selected from the sequence $\{F_n(x_1, x_2)\}$, which is convergent to some function
$F(x_1, x_2)$ continuous on the right and with non-negative second difference.
But $F(x_1, x_2)$ must be a d.f. according to Theorem 5, since the relation (6) is
satisfied by the sequence $\{F_{n_k}(x_1, x_2)\}$. Moreover, the limit of the sequence
$\{\varphi_{n_k}(t_1, t_2)\}$, namely $\varphi(t_1, t_2)$, when considered in a sufficiently small neighborhood
of the origin, is the m.g.f. of $F(x_1, x_2)$. Since the d.f. is uniquely determined by
its m.g.f., it follows that every convergent subsequence of $\{F_n(x_1, x_2)\}$ converges
to the same limit $F(x_1, x_2)$ at each point of continuity of $F(x_1, x_2)$. This
is, however, the same as to say that the sequence $\{F_n(x_1, x_2)\}$ itself converges to
$F(x_1, x_2)$ at each point of continuity of $F(x_1, x_2)$.

To prove (c), we have to show that the m.g.f. of $F(x_1, x_2)$, say $\varphi^*(t_1, t_2)$, exists
for $|t_i| < \alpha_i$ and is equal to $\varphi(t_1, t_2)$, $|t_i| < \alpha_i$, $(i = 1, 2)$. (We have proved that
$\varphi^*(t_1, t_2) = \varphi(t_1, t_2)$ only for sufficiently small values of $|t_1|$ and $|t_2|$.) The
existence of $\varphi^*(t_1, t_2)$ for $|t_i| < \alpha_i$, $(i = 1, 2)$, can be easily established by the
method used by Curtiss [1, p. 433]. Suppose indeed that $\varphi^*(t_1, t_2)$ does not
exist at some point $(t_1^0, t_2^0)$, where $|t_i^0| < \alpha_i$, $(i = 1, 2)$. That means that we
can find a positive number $N$ such that

$$\int_{-N}^{N}\!\!\int_{-N}^{N} e^{t_1^0x_1 + t_2^0x_2}\,dF(x_1, x_2) > \varphi(t_1^0, t_2^0). \tag{17}$$

Since $\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2)$ at all points of continuity of $F(x_1, x_2)$, and since
$N$ can be so chosen that the marginal d.f.'s $F_1(x_1)$ and $F_2(x_2)$ are continuous for
$x_1 = x_2 = \pm N$, it follows that

$$\lim_{n\to\infty} \int_{-N}^{N}\!\!\int_{-N}^{N} e^{t_1^0x_1 + t_2^0x_2}\,dF_n(x_1, x_2) = \int_{-N}^{N}\!\!\int_{-N}^{N} e^{t_1^0x_1 + t_2^0x_2}\,dF(x_1, x_2). \tag{18}$$

The formulas (17) and (18) give $\lim_{n\to\infty} \varphi_n(t_1^0, t_2^0) > \varphi(t_1^0, t_2^0)$, which is impossible
because $\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2)$ for $|t_i| < \alpha_i$, $(i = 1, 2)$.

To prove that $\varphi(t_1, t_2) = \varphi^*(t_1, t_2)$ for $|t_i| < \alpha_i$, $(i = 1, 2)$, let $(t_1, t_2)$ denote
a fixed point such that $|t_i| < \alpha_i$, $(i = 1, 2)$. Clearly, $\varphi_n(tt_1, tt_2)$, $(n = 1, 2, \cdots)$,
and $\varphi^*(tt_1, tt_2)$, considered as functions of the variable $t$, are m.g.f.'s provided that
$|tt_i| < \alpha_i$, $(i = 1, 2)$. (See first part of proof of Theorem 4.) Now, according
to Theorem 2, the limit of the sequence $\{\varphi_n(tt_1, tt_2)\}$, namely $\varphi(tt_1, tt_2)$, $|tt_i| < \alpha_i$,
$(i = 1, 2)$, is also a m.g.f. Since $\varphi(tt_1, tt_2) = \varphi^*(tt_1, tt_2)$ in a sufficiently small
interval containing the point $t = 0$, it follows from the uniqueness property of the
m.g.f. in the univariate case that $\varphi(tt_1, tt_2) = \varphi^*(tt_1, tt_2)$ identically for $|tt_i| < \alpha_i$,
$(i = 1, 2)$. Putting $t = 1$, we find $\varphi(t_1, t_2) = \varphi^*(t_1, t_2)$, $|t_i| < \alpha_i$, $(i = 1, 2)$.
Thus (c) is completely proved.

To prove (d), it is sufficient to notice that the sequence $\{\varphi_n(t_1, t_2)\}$ is uniformly
continuous in each two-dimensional interval $|t_i| \le \beta_i < \alpha_i$, $(i = 1, 2)$, (that
is, for any $\epsilon > 0$, there exists a positive number $\delta = \delta(\epsilon)$ such that

$$|\varphi_n(t_1, t_2) - \varphi_n(t_1', t_2')| < \epsilon$$

if

$$|t_i - t_i'| < \delta, \quad |t_i| \le \beta_i, \quad |t_i'| \le \beta_i, \quad (i = 1, 2),\ (n = 1, 2, \cdots)).$$

Consequently, the sequence $\{\varphi_n(t_1, t_2)\}$, which is convergent for $|t_i| \le \beta_i$, must
be uniformly convergent if $|t_i| \le \beta_i$, $(i = 1, 2)$.
A GENERALIZATION OF TSHEBYSHEV'S INEQUALITY TO TWO DIMENSIONS

By Z. W. Birnbaum, J. Raymond, and H. S. Zuckerman
University of Washington

1. Let $X_1, X_2, \cdots, X_n$ be independent random variables with expectations
$E(X_j) = e_j$ and variances $\sigma^2(X_j) = \sigma_j^2$ for $j = 1, 2, \cdots, n$. The question
may be asked: What is the upper bound for the probability
that the point $(X_1, X_2, \cdots, X_n)$ does not fall inside of the ellipsoid

$$\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} = 1\,?$$

For $n = 1$ the answer to this question is given by Tshebyshev's inequality

$$P\left[\frac{(X - e)^2}{t^2} \ge 1\right] \le \frac{\sigma^2(X)}{t^2}, \tag{1.1}$$

which can not be improved without further assumptions. By a trivial generalization
of the argument leading to (1.1) one can prove the inequality

$$P\left[\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} \ge 1\right] \le \sum_{j=1}^{n} \frac{\sigma_j^2}{t_j^2} \tag{1.2}$$

for any integer $n$. This inequality, however, can be improved for $n \ge 2$. In
particular, for $n = 2$, the following theorem will be proved:

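Theorem 1.1. Let $X$ and $Y$ be independent random variables, with expectations
$E(X) = X_0$, $E(Y) = Y_0$ and variances $\sigma_X^2$, $\sigma_Y^2$. Then, for any $s > 0$, $t > 0$
such that $\dfrac{\sigma_X}{s} \le \dfrac{\sigma_Y}{t}$, we have

$$P\left[\frac{(X - X_0)^2}{s^2} + \frac{(Y - Y_0)^2}{t^2} \ge 1\right] \le L(s, t), \tag{1.3}$$

where

$$L(s, t) = \begin{cases}
1 & \text{if } \dfrac{\sigma_X^2}{s^2} + \dfrac{\sigma_Y^2}{t^2} \ge 1, \\[2ex]
\dfrac{\sigma_Y^2/t^2}{1 - \sigma_X^2/s^2} & \text{if } \dfrac{\sigma_X^2}{s^2} + \dfrac{\sigma_Y^2}{t^2} \le 1 \le \dfrac12\left[\dfrac{\sigma_X^2}{s^2} + \dfrac{2\sigma_Y^2}{t^2} + \sqrt{\dfrac{\sigma_X^4}{s^4} + \dfrac{4\sigma_Y^4}{t^4}}\,\right], \\[2ex]
\dfrac{\sigma_X^2}{s^2} + \dfrac{\sigma_Y^2}{t^2} - \dfrac{\sigma_X^2\sigma_Y^2}{s^2t^2} & \text{if } \dfrac12\left[\dfrac{\sigma_X^2}{s^2} + \dfrac{2\sigma_Y^2}{t^2} + \sqrt{\dfrac{\sigma_X^4}{s^4} + \dfrac{4\sigma_Y^4}{t^4}}\,\right] \le 1.
\end{cases} \tag{1.4}$$

For any given $\sigma_X$, $\sigma_Y$, $s > 0$, $t > 0$ such that $\dfrac{\sigma_X}{s} \le \dfrac{\sigma_Y}{t}$ there exist independent random
variables $X$ and $Y$ with the variances $\sigma_X^2$, $\sigma_Y^2$, such that the equality sign is true in
(1.3).

This theorem is a special case of the more general statement: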

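Theorem 1.2. Let $W$, $Z$ be independent random variables such that

$$P(W < 0) = P(Z < 0) = 0, \tag{1.4}$$
$$E(W) = \lambda, \qquad E(Z) = \mu, \tag{1.5}$$
$$\lambda \le \mu. \tag{1.6}$$

Then, for any $t > 0$, we have

$$P(W + Z \ge t) \le M(t), \tag{1.7}$$

where

$$M(t) = \begin{cases}
1 & \text{if } t \le \lambda + \mu, \\[1ex]
\dfrac{\mu}{t - \lambda} & \text{if } \lambda + \mu \le t \le \tfrac12\left(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2}\right), \\[2ex]
\dfrac{\lambda + \mu}{t} - \dfrac{\lambda\mu}{t^2} & \text{if } \tfrac12\left(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2}\right) \le t.
\end{cases} \tag{1.8}$$

For any given $\lambda > 0$, $\mu > 0$, $\lambda \le \mu$, and $t > 0$, there exist independent variables
$W$, $Z$ such that (1.4) and (1.6) are fulfilled and that the equality sign is true in (1.7).
Theorem 1.1 is obtained from Theorem 1.2 by writing

$$W = \frac{(X - X_0)^2}{s^2}, \qquad Z = \frac{(Y - Y_0)^2}{t^2}, \qquad t = 1.$$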
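The bound of Theorem 1.2 is easy to tabulate. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $M(t)$ to be $1$, $\mu/(t-\lambda)$ and $(\lambda+\mu)/t - \lambda\mu/t^2$ on the three ranges of $t$, and it uses two-point distributions of the kind constructed in the proof to show that the bound is attained. The helper names (`M`, `prob_sum_at_least`, `extremal_pair`) are hypothetical.

```python
import math

def M(t, lam, mu):
    """Bound M(t) of (1.8) for P(W + Z >= t); assumes 0 < lam <= mu."""
    if t <= lam + mu:
        return 1.0
    root = 0.5 * (lam + 2.0 * mu + math.sqrt(lam * lam + 4.0 * mu * mu))
    if t <= root:
        return mu / (t - lam)
    return (lam + mu) / t - lam * mu / (t * t)

def prob_sum_at_least(t, w_dist, z_dist, tol=1e-12):
    """P(W + Z >= t) for independent discrete W, Z given as [(value, prob), ...]."""
    return sum(pw * pz
               for w, pw in w_dist
               for z, pz in z_dist
               if w + z >= t - tol)

def extremal_pair(t, lam, mu):
    """Two-point laws with means lam, mu that attain M(t) when t > lam + mu."""
    root = 0.5 * (lam + 2.0 * mu + math.sqrt(lam * lam + 4.0 * mu * mu))
    if t <= root:
        # W is constant; Z puts mass mu/(t - lam) on t - lam
        w = [(lam, 1.0)]
        z = [(0.0, 1.0 - mu / (t - lam)), (t - lam, mu / (t - lam))]
    else:
        # both variables take only the values 0 and t
        w = [(0.0, 1.0 - lam / t), (t, lam / t)]
        z = [(0.0, 1.0 - mu / t), (t, mu / t)]
    return w, z

for t in (4.0, 6.0):
    w, z = extremal_pair(t, 1.0, 2.0)
    print(t, M(t, 1.0, 2.0), prob_sum_at_least(t, w, z))
```

With $\lambda = 1$, $\mu = 2$ the change of regime occurs near $t \approx 4.56$, so the two values of $t$ in the loop exercise the second and third branches.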


2. Before proving Theorem 1.2 we shall derive two lemmas. The first of these
lemmas deals with more than one variable. Since its proof for general $m$ does not
present any additional difficulties it will be stated and proven for any number
$m \ge 1$ of variables, although in the proof of Theorem 1.2 it will be used only
for $m = 1$.

Lemma 1. Let $U, V_1, V_2, \cdots, V_m$ be independent discrete random variables
with only non-negative possible values, and let $U$ have a probability distribution
with the possible values $0 \le U_1 < U_2 < \cdots < U_n$ and the probabilities $P(U_i) = r_i$
for $i = 1, 2, \cdots, n$. We consider any three possible values $U_j$, $U_k$, $U_l$ of $U$ such
that

$$0 \le U_j < U_k < U_l,$$

with the corresponding probabilities $r_j$, $r_k$, $r_l$. Then, for any $t > 0$, there exists a
random variable $U'$ with the same distribution as $U$ except that the probabilities
$r_j$, $r_k$, $r_l$ of $U_j$, $U_k$, $U_l$ are replaced by $r_j'$, $r_k'$, $r_l'$ such that

(2.1) $E(U') = E(U)$,

(2.2) one of $r_j'$, $r_k'$, $r_l'$ is zero,

(2.3) $P(U' + V_1 + \cdots + V_m \ge t) \ge P(U + V_1 + \cdots + V_m \ge t)$.

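Proof: let $r_j'$, $r_k'$, $r_l'$ be written

$$r_j' = r_j + \alpha\beta, \qquad r_k' = r_k - \beta, \qquad r_l' = r_l + (1 - \alpha)\beta. \tag{2.4}$$

For any $\alpha$, $\beta$ we then have

$$r_j' + r_k' + r_l' = r_j + r_k + r_l.$$

Choosing

$$\alpha = (U_l - U_k)/(U_l - U_j) \tag{2.5}$$

we obtain the equality

$$U_jr_j' + U_kr_k' + U_lr_l' = U_jr_j + U_kr_k + U_lr_l,$$

so that (2.1) is true for any $\beta$.

We obviously have

$$P\left(U + \sum_{\nu=1}^{m} V_\nu \ge t\right) = \sum_{i=1}^{n} P(U = U_i)\cdot P\left(\sum_{\nu=1}^{m} V_\nu \ge t - U_i\right) = \sum_{i=1}^{n} r_i\,P\left(\sum_{\nu=1}^{m} V_\nu \ge t - U_i\right). \tag{2.6}$$

The variable $U'$ has the same possible values $U_i$ as the variable $U$. Writing
$P(U' = U_i) = r_i'$, for $i = 1, 2, \cdots, n$, we also have

$$P\left(U' + \sum_{\nu=1}^{m} V_\nu \ge t\right) = \sum_{i=1}^{n} r_i'\,P\left(\sum_{\nu=1}^{m} V_\nu \ge t - U_i\right). \tag{2.7}$$

From (2.6), (2.7), and (2.4) we obtain

$$P\left(U' + \sum_{\nu=1}^{m} V_\nu \ge t\right) - P\left(U + \sum_{\nu=1}^{m} V_\nu \ge t\right)$$
$$= \alpha\beta\,P\left(\sum_{\nu=1}^{m} V_\nu \ge t - U_j\right) - \beta\,P\left(\sum_{\nu=1}^{m} V_\nu \ge t - U_k\right) + (1 - \alpha)\beta\,P\left(\sum_{\nu=1}^{m} V_\nu \ge t - U_l\right). \tag{2.8}$$

For $\alpha$ determined by (2.5), the right-hand side of (2.8) is of the form $C\beta$, and
will be non-negative if sign $\beta$ = sign $C$. If sign $C$ is positive, we choose $\beta = r_k$ and
have, from (2.4), $r_k' = 0$, and, from (2.8), the inequality (2.3). If sign $C$ is
negative, we set $\beta = \operatorname{Max}\left(-\dfrac{r_j}{\alpha},\ -\dfrac{r_l}{1 - \alpha}\right)$, which leads to either $r_j' = 0$ or $r_l' =
0$, and again to (2.3). In both cases we have kept the probabilities $r_j'$, $r_k'$, $r_l'$
non-negative as they should be.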
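The construction in this proof is mechanical enough to be carried out on a small example. The following sketch is an illustration for the case $m = 1$ only; the function names and the example distribution are assumptions made here, and exact rational arithmetic is used so that conditions (2.1) to (2.3) can be verified without rounding.

```python
from fractions import Fraction

def tail_prob(v_dist, c):
    # P(V >= c) for a discrete V given as [(value, prob), ...]
    return sum(p for v, p in v_dist if v >= c)

def prob_at_least(values, probs, v_dist, t):
    # P(U + V >= t) for independent U (values, probs) and V, as in (2.6)
    return sum(r * tail_prob(v_dist, t - u) for u, r in zip(values, probs))

def shift_mass(values, probs, j, k, l, v_dist, t):
    """Redistribute the probabilities of U_j < U_k < U_l following (2.4)."""
    uj, uk, ul = values[j], values[k], values[l]
    alpha = Fraction(ul - uk, ul - uj)            # choice (2.5), keeps E(U)
    # coefficient C of beta on the right-hand side of (2.8)
    C = (alpha * tail_prob(v_dist, t - uj)
         - tail_prob(v_dist, t - uk)
         + (1 - alpha) * tail_prob(v_dist, t - ul))
    if C >= 0:
        beta = probs[k]                            # empties U_k
    else:
        beta = max(-probs[j] / alpha, -probs[l] / (1 - alpha))  # empties U_j or U_l
    new = list(probs)
    new[j] += alpha * beta
    new[k] -= beta
    new[l] += (1 - alpha) * beta
    return new

values = [0, 1, 3]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
v_dist = [(0, Fraction(1, 2)), (2, Fraction(1, 2))]
new_probs = shift_mass(values, probs, 0, 1, 2, v_dist, 3)
print(new_probs)
```

Repeating the step on any three values that still carry positive probability removes one value each time, which is how the proof of Theorem 1.2 reduces each variable to two possible values.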
Lemma 2. Let the discrete random variable $U$ have only the two non-negative
values $U_1 < U_2$, with the corresponding probabilities $r_1$, $r_2$, and let $t$ be a given
number such that

$$E(U) < t < U_2. \tag{2.9}$$

Then there exists a number $a > 0$ such that the random variable $U'$ with the possible
values

$$U_1' = U_1 + a, \qquad U_2' = t \tag{2.91}$$

and the corresponding probabilities $r_1$, $r_2$, has the properties

$$0 \le U_1' < U_2', \tag{2.92}$$
$$E(U') = E(U). \tag{2.93}$$

Proof: to have (2.91) and (2.93) it is sufficient to choose

$$a = \frac{r_2(U_2 - t)}{r_1}.$$

Then (2.92) is also fulfilled since, in view of (2.9), we have

$$U_1' = \frac{r_1U_1 + r_2U_2 - r_2t}{r_1} = \frac{E(U) - r_2t}{r_1} < \frac{t - r_2t}{r_1} = t = U_2',$$

and obviously $a > 0$ and hence $U_1' > U_1 \ge 0$.
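A short numerical instance may make Lemma 2 concrete. The sketch below is an assumption-laden illustration (the function name and the numbers are chosen here, not taken from the paper): it computes $a = r_2(U_2 - t)/r_1$ and returns the new pair of values, using exact fractions.

```python
from fractions import Fraction

def lower_top_value(u1, u2, r1, r2, t):
    """Replace the values (u1, u2) by (u1 + a, t) as in (2.91), keeping the mean."""
    if not (r1 * u1 + r2 * u2 < t < u2):
        raise ValueError("requires E(U) < t < U_2, condition (2.9)")
    a = r2 * (u2 - t) / r1
    return u1 + a, t

u1, u2 = Fraction(0), Fraction(10)
r1, r2 = Fraction(3, 4), Fraction(1, 4)
new_u1, new_u2 = lower_top_value(u1, u2, r1, r2, Fraction(4))
print(new_u1, new_u2)
```

Since the upper value is lowered only to $t$, any event of the form $U + V \ge t$ with $V \ge 0$ that occurred through $U = U_2$ still occurs, which is why the replacement does not decrease $P(W + Z \ge t)$ in the proof of Theorem 1.2.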

3. Theorem 1.2 will first be proven under the assumption that $W$ and $Z$ are
discrete random variables, each with a finite number of non-negative possible
values. By repeatedly applying Lemma 1 with $m = 1$, $U = W$, $V_1 = Z$, we
reduce the number of possible values of $W$ which have non-zero probabilities
to two, and denote those possible values by $W_1 < W_2$, and their probabilities
by $p_1$ and $p_2 = 1 - p_1$. Then, applying Lemma 1 to the case $m = 1$, $U = Z$,
$V_1 = W$, we similarly reduce the possible values of $Z$ to the two non-negative
values $Z_1 < Z_2$, and denote the corresponding probabilities by $q_1$ and $q_2 = 1 - q_1$.
Throughout all these steps the expectations $E(W) = \lambda$ and $E(Z) = \mu$ remain
unchanged, and $P(W + Z \ge t)$ is not decreased.

For $t \le \lambda + \mu$, inequality (1.7) is obviously true, and equality is attained for
$W$ having the only possible value $\lambda$ with probability 1 and $Z$ having the only
possible value $\mu$ with probability 1.

For the remainder of the proof we assume $t > \lambda + \mu$. We then have

$$t > \lambda + \mu \ge \lambda + Z_1 \ge W_1 + Z_1.$$

If $W_2 > t$, we may replace it by $W_2 = t$ according to Lemma 2. Similarly, if
$Z_2 > t$, we may replace it by $Z_2 = t$. The probability $P(W + Z \ge t)$ is not
decreased in this process. We may thus assume, without loss of generality, that

$$W_2 \le t, \qquad Z_2 \le t.$$
The joint distribution of $(W, Z)$ has now the possible values represented by the
four points $(W_1, Z_1)$, $(W_1, Z_2)$, $(W_2, Z_1)$, $(W_2, Z_2)$. The coordinates of these
four points and their probabilities fulfill the following conditions

$$0 \le W_1 \le \lambda \le W_2 \le t, \qquad 0 \le Z_1 \le \mu \le Z_2 \le t, \tag{3.1}$$
$$p_1 + p_2 = q_1 + q_2 = 1, \tag{3.2}$$
$$p_1W_1 + p_2W_2 = \lambda, \qquad q_1Z_1 + q_2Z_2 = \mu. \tag{3.3}$$

In view of (3.1), the point $(W_1, Z_1)$ always lies below the line $W + Z = t$. The
other points may or may not lie below that line. Accordingly, we distinguish
the cases listed in Table I. These clearly include all possible cases since $(W_2, Z_2)$
can not be below the line $W + Z = t$ without all the other points being below
that line.

In case V we have $P(W + Z \ge t) = 0$.

For the discussion of the remaining cases we note the following relationships
which follow from (3.2) and (3.3):

$$p_1 = \frac{W_2 - \lambda}{W_2 - W_1}, \qquad p_2 = \frac{\lambda - W_1}{W_2 - W_1}, \qquad q_1 = \frac{Z_2 - \mu}{Z_2 - Z_1}, \qquad q_2 = \frac{\mu - Z_1}{Z_2 - Z_1}.$$

TABLE I

Case   Points below line W + Z = t                       Points not below line W + Z = t
I      (W_1, Z_1)                                        (W_1, Z_2), (W_2, Z_1), (W_2, Z_2)
II     (W_1, Z_1), (W_2, Z_1)                            (W_1, Z_2), (W_2, Z_2)
III    (W_1, Z_1), (W_1, Z_2)                            (W_2, Z_1), (W_2, Z_2)
IV     (W_1, Z_1), (W_1, Z_2), (W_2, Z_1)                (W_2, Z_2)
V      (W_1, Z_1), (W_1, Z_2), (W_2, Z_1), (W_2, Z_2)    none

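In case I we have

$$W_1 + Z_1 < t, \quad W_2 + Z_1 \ge t, \quad W_1 + Z_2 \ge t, \quad W_2 + Z_2 \ge t, \tag{3.41}$$

$$P = P(W + Z \ge t) = p_2q_1 + p_1q_2 + p_2q_2 = 1 - p_1q_1 = 1 - \frac{W_2 - \lambda}{W_2 - W_1}\cdot\frac{Z_2 - \mu}{Z_2 - Z_1}.$$

Since $P$ is a decreasing function of $W_1$ and $Z_1$, we replace $W_1$ and $Z_1$ by the
smallest values compatible with (3.41), namely $W_1 = t - Z_2$, $Z_1 = t - W_2$,
and obtain

$$P \le 1 - \frac{(W_2 - \lambda)(Z_2 - \mu)}{(W_2 + Z_2 - t)^2} = R(W_2, Z_2).$$

For fixed $Z_2$, $R(W_2, Z_2)$ has a minimum at $W_2 = Z_2 + 2\lambda - t$ and no other
extremum, hence it assumes its maximum at one or both of the end-points of the
interval for $W_2$ which, by (3.1) and (3.41), is

$$t - Z_1 \le W_2 \le t.$$

In view of (3.1) we also have $t - \mu \le t - Z_1$, and hence

$$P \le \operatorname{Max}\,[R(t - \mu, Z_2),\ R(t, Z_2)].$$

We find

$$R(t - \mu, Z_2) = 1 - \frac{t - \lambda - \mu}{Z_2 - \mu} \le 1 - \frac{t - \lambda - \mu}{t - \mu} = \frac{\lambda}{t - \mu}$$

and

$$R(t, Z_2) = 1 - \frac{(t - \lambda)(Z_2 - \mu)}{Z_2^2} = R^{(1)}(Z_2).$$

This last expression has a minimum for $Z_2 = 2\mu$ and no other extremum, hence it
assumes its maximum at the ends of the interval for $Z_2$ which, by (3.41) and
(3.1), is

$$t - W_1 \le Z_2 \le t.$$

From (3.1) we also have $t - \lambda \le t - W_1$, and thus

$$R(t, Z_2) \le \operatorname{Max}\,[R^{(1)}(t - \lambda),\ R^{(1)}(t)] = \operatorname{Max}\left[\frac{\mu}{t - \lambda},\ \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right].$$

Finally, we obtain

$$P \le \operatorname{Max}\left[\frac{\lambda}{t - \mu},\ \frac{\mu}{t - \lambda},\ \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right].$$

Each of the values $P = \dfrac{\lambda}{t - \mu},\ \dfrac{\mu}{t - \lambda},\ \dfrac{\lambda + \mu}{t} - \dfrac{\lambda\mu}{t^2}$ can be attained in case I,
as is shown by the probability distributions

$$W_1 = 0,\ W_2 = t - \mu,\ Z_1 = \mu,\ Z_2 = t; \qquad p_1 = 1 - \frac{\lambda}{t - \mu},\ p_2 = \frac{\lambda}{t - \mu},\ q_1 = 1,\ q_2 = 0; \tag{3.42}$$

$$W_1 = \lambda,\ W_2 = t,\ Z_1 = 0,\ Z_2 = t - \lambda; \qquad p_1 = 1,\ p_2 = 0,\ q_1 = 1 - \frac{\mu}{t - \lambda},\ q_2 = \frac{\mu}{t - \lambda}; \tag{3.43}$$

$$W_1 = 0,\ W_2 = t,\ Z_1 = 0,\ Z_2 = t; \qquad p_1 = 1 - \frac{\lambda}{t},\ p_2 = \frac{\lambda}{t},\ q_1 = 1 - \frac{\mu}{t},\ q_2 = \frac{\mu}{t}. \tag{3.44}$$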
In case II we have

$$W_1 + Z_1 < t, \quad W_2 + Z_1 < t, \quad W_1 + Z_2 \ge t, \quad W_2 + Z_2 \ge t, \tag{3.51}$$

$$P = P(W + Z \ge t) = p_1q_2 + p_2q_2 = q_2 = \frac{\mu - Z_1}{Z_2 - Z_1}.$$

This is a decreasing function of $Z_1$ as well as of $Z_2$ and hence takes its maximum
for the smallest values of $Z_1$ and $Z_2$ compatible with (3.1) and (3.51), that is for
$Z_1 = 0$, $Z_2 = t - \lambda$. We thus obtain

$$P \le \frac{\mu}{t - \lambda}.$$

This upper bound can be attained in case II, as may be seen from the distribution

$$W_1 = \lambda,\ W_2 = \lambda,\ Z_1 = 0,\ Z_2 = t - \lambda; \qquad p_1 = 1,\ p_2 = 0,\ q_1 = 1 - \frac{\mu}{t - \lambda},\ q_2 = \frac{\mu}{t - \lambda}. \tag{3.52}$$

Case III is symmetrical with case II and leads to the inequality

$$P \le \frac{\lambda}{t - \mu}.$$

In case IV we have

$$W_1 + Z_1 < t, \quad W_1 + Z_2 < t, \quad W_2 + Z_1 < t, \quad W_2 + Z_2 \ge t, \tag{3.61}$$

$$P = P(W + Z \ge t) = p_2q_2 = \frac{(\lambda - W_1)(\mu - Z_1)}{(W_2 - W_1)(Z_2 - Z_1)}.$$

The right hand side is a decreasing function of each of the variables $W_1$, $W_2$,
$Z_1$, $Z_2$, and hence is increased by choosing for these variables the smallest values
compatible with (3.61), i.e.

$$W_1 = Z_1 = 0, \qquad W_2 + Z_2 = t, \tag{3.62}$$

for which we obtain

$$P \le \frac{\lambda\mu}{W_2(t - W_2)} = P^{(1)}(W_2).$$

Since $P^{(1)}(W_2)$ has a minimum at $W_2 = \dfrac{t}{2}$ and no other extremum, it attains its
largest value at one of the end points of the interval for $W_2$ which, by (3.1),
(3.61) and (3.62), is

$$\lambda \le W_2 \le t - \mu.$$

This leads to

$$P \le \operatorname{Max}\,[P^{(1)}(\lambda),\ P^{(1)}(t - \mu)] = \operatorname{Max}\left[\frac{\mu}{t - \lambda},\ \frac{\lambda}{t - \mu}\right].$$

The upper bounds $\dfrac{\mu}{t - \lambda}$ and $\dfrac{\lambda}{t - \mu}$, respectively, are attained in case IV for the
probability distributions

$$W_1 = 0,\ W_2 = \lambda,\ Z_1 = 0,\ Z_2 = t - \lambda; \qquad p_1 = 0,\ p_2 = 1,\ q_1 = 1 - \frac{\mu}{t - \lambda},\ q_2 = \frac{\mu}{t - \lambda} \tag{3.63}$$

and

$$W_1 = 0,\ W_2 = t - \mu,\ Z_1 = 0,\ Z_2 = \mu; \qquad p_1 = 1 - \frac{\lambda}{t - \mu},\ p_2 = \frac{\lambda}{t - \mu},\ q_1 = 0,\ q_2 = 1.$$

From the preceding discussion we conclude that $P = P(W + Z \ge t)$ always
fulfills the inequality

$$P \le \operatorname{Max}\left[\frac{\lambda}{t - \mu},\ \frac{\mu}{t - \lambda},\ \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right] = U(t)$$

for $t > \lambda + \mu$. Since we have assumed $\lambda \le \mu$, we have

$$\frac{\lambda}{t - \mu} \le \frac{\mu}{t - \lambda}$$

for $t > \lambda + \mu$, and therefore

$$U(t) = \operatorname{Max}\left[\frac{\mu}{t - \lambda},\ \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right].$$

It is easily verified that

$$U(t) = \frac{\mu}{t - \lambda} \quad\text{for}\quad \lambda + \mu \le t \le \tfrac12\left(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2}\right)$$

and

$$U(t) = \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2} \quad\text{for}\quad \tfrac12\left(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2}\right) \le t,$$

so that we have $U(t) = M(t)$ as defined in (1.8). For given $\lambda$, $\mu$ and any $t > \lambda
+ \mu$, the equality $P = \dfrac{\mu}{t - \lambda}$ is fulfilled for the distributions (3.43), (3.52) and
(3.63), while the equality $P = \dfrac{\lambda + \mu}{t} - \dfrac{\lambda\mu}{t^2}$ is fulfilled for the distribution (3.44).

This completes the proof of Theorem 1.2 for discrete random variables. If
$W$ and $Z$ are independent random variables with the cumulative probability
functions $P(W \le w) = F(w)$ and $P(Z \le z) = G(z)$, then each of these cumulative
probability functions can be uniformly approximated by a step function with a
finite number of steps, that is by the cumulative probability function of a discrete
random variable with a finite number of possible values. Since for such variables
Theorem 1.2 is proven, it also is true for the general random variables $W$ and $Z$.
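4. An attempt to extend the method used in proving Theorem 1.2 to more
than two variables leads to arguments of a prohibitive length. It is possible,
however, to obtain corollaries of Theorems 1.1 and 1.2 which lead to an improvement
of inequality (1.2) for $n$ variables.

Corollary 2.1. Let $X_1, X_2, \cdots, X_n$ be independent random variables with
expectations $E(X_j) = e_j$ and variances $\sigma^2(X_j) = \sigma_j^2$. Then, for any $t_j > 0$,
$j = 1, 2, \cdots, n$, and any $m$ such that

$$\Sigma_1 = \sum_{j=1}^{m} \frac{\sigma_j^2}{t_j^2} \le \sum_{j=m+1}^{n} \frac{\sigma_j^2}{t_j^2} = \Sigma_2,$$

we have the inequality

$$P\left[\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} \ge 1\right] \le \begin{cases}
1 & \text{if } \Sigma_1 + \Sigma_2 \ge 1, \\[1ex]
1 - \dfrac{1 - (\Sigma_1 + \Sigma_2)}{1 - \Sigma_1} & \text{if } \Sigma_1 + \Sigma_2 \le 1 \le \tfrac12\left[\Sigma_1 + 2\Sigma_2 + \sqrt{\Sigma_1^2 + 4\Sigma_2^2}\,\right], \\[2ex]
\Sigma_1 + \Sigma_2 - \Sigma_1\Sigma_2 & \text{if } \tfrac12\left[\Sigma_1 + 2\Sigma_2 + \sqrt{\Sigma_1^2 + 4\Sigma_2^2}\,\right] \le 1.
\end{cases}$$

This corollary is a special case of the following corollary to Theorem 1.2.

Corollary 2.2. Let $W_1, W_2, \cdots, W_n$ be independent random variables
such that $P(W_j < 0) = 0$ for $j = 1, 2, \cdots, n$, and let $m$ be any integer such that

$$\sum_{j=1}^{m} E(W_j) = \lambda, \qquad \sum_{j=m+1}^{n} E(W_j) = \mu, \qquad \lambda \le \mu.$$

Then, for any $t > 0$, we have

$$P\left(\sum_{j=1}^{n} W_j \ge t\right) \le M(t),$$

where $M(t)$ is defined by (1.8).

This corollary follows immediately from Theorem 1.2 by writing

$$W = \sum_{j=1}^{m} W_j, \qquad Z = \sum_{j=m+1}^{n} W_j.$$

To obtain Corollary 2.1, one only has to write in Corollary 2.2

$$W_j = \frac{(X_j - e_j)^2}{t_j^2}, \qquad t = 1.$$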

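If some additional assumptions are made on the expectations $E(W_j)$ or on the
variances $\sigma_j^2$, the upper bounds in Corollaries 2.1 and 2.2 may be minimized by
proper choice of $m$ or of the $t_j$. For example, if all the variances are equal

$$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_n^2 = \sigma^2$$

and $n$ is even, one obtains the inequality

$$P\left[\sum_{j=1}^{n} (X_j - e_j)^2 \ge t^2\right] \le \begin{cases}
1 & \text{if } t^2 \le n\sigma^2, \\[1ex]
1 - \dfrac{t^2 - n\sigma^2}{t^2 - \tfrac{n}{2}\sigma^2} & \text{if } n\sigma^2 \le t^2 \le \dfrac{3 + \sqrt5}{4}\,n\sigma^2, \\[2ex]
\dfrac{n\sigma^2}{t^2}\left(1 - \dfrac14\,\dfrac{n\sigma^2}{t^2}\right) & \text{if } \dfrac{3 + \sqrt5}{4}\,n\sigma^2 \le t^2.
\end{cases}$$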
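To see how much is gained over (1.2) in the equal-variance example, the two bounds can be compared numerically. The sketch below is an illustration only; it assumes the three-branch form obtained from Corollary 2.1 with $m = n/2$ and all $t_j = t$, and the function names are hypothetical.

```python
import math

def equal_variance_bound(n, sigma2, t2):
    """Bound for P(sum (X_j - e_j)^2 >= t^2), equal variances, n even."""
    s = n * sigma2
    if t2 <= s:
        return 1.0
    if t2 <= (3.0 + math.sqrt(5.0)) / 4.0 * s:
        return 1.0 - (t2 - s) / (t2 - s / 2.0)
    return (s / t2) * (1.0 - s / (4.0 * t2))

def simple_bound(n, sigma2, t2):
    """Bound (1.2) with all t_j equal to t, capped at 1."""
    return min(1.0, n * sigma2 / t2)

for t2 in (1.5, 2.5, 4.0, 10.0):
    print(t2, simple_bound(2, 1.0, t2), equal_variance_bound(2, 1.0, t2))
```

In the third branch the improvement over (1.2) is exactly the subtracted term $\tfrac14 (n\sigma^2/t^2)^2$, so the gain becomes relatively small as $t^2$ grows.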





DISTRIBUTION OF THE SERIAL CORRELATION COEFFICIENT
IN A CIRCULARLY CORRELATED UNIVERSE*

By R. B. Leipnik

Cowles Commission for Research in Economics

1. Summary. It is desired to find an approximate distribution of simple
form for the statistic

$$\bar r = \frac{x_1x_2 + \cdots + x_Tx_1}{x_1^2 + \cdots + x_T^2}$$

($\bar r$ is an estimate of the serial correlation coefficient $\rho$ in a circular universe)
in the case that $\rho \ne 0$ in the universe.
Such a distribution is obtained by smoothing the joint characteristic function
of the numerator and denominator of the expression for $\bar r$. The first two moments
are calculated; from these $\bar r$ is seen to be a consistent estimate of $\rho$. A
graph of this distribution for sample size $T = 20$ and various values of $\rho$ is given.

In addition, an approximate distribution for $p = x_1^2 + \cdots + x_T^2$ is derived
which reduces to the exact ($\chi^2$) distribution if $\rho = 0$. From a formula which
yields all moments, it is concluded that, at least up to the degree of approximation
attained, $p/T$ is an unbiased and consistent estimate of $\sigma^2$.

2. Several writers have investigated the temporally homogeneous stochastic
process defined by

$$x_t - \rho x_{t-1} = z_t, \qquad t = 1, 2, \cdots, T, \qquad |\rho| < 1, \tag{1}$$

where the $z_t$ are unobservable disturbances, normally and independently distributed
with mean zero and constant variance, the $x_t$ are observed variates, and the
"first observation" $x_0$ has a normal distribution with mean zero and such a
variance $\sigma^2$ that all later observations have the same variance. Thus we have

2 

( 2 ) 


3 

C j "" 


and the joint distribution of a sample of $T + 1$ successive values is


(3) 


. (1 - p®)! r 1.0 

Xi, ' ' • , Xt) — ^2^^!^>'/s+i/ 3 • exp {aio -f s 


Xr 


— 2p(XoXi 4- • • • + Oii-iOlr) + (1 -f p*)(xi 4- • • • + Xy_i)) ^ . 

Koopmans ([1], formula 96), by smoothing characteristic values, has obtained
an approximation to the distribution of the serial correlation coefficient $r$ for the
case $\rho = 0$, where

$$r = \frac{x_0x_1 + \cdots + x_{T-1}x_T}{x_0^2 + \cdots + x_T^2}. \tag{4}$$


* Cowles Commission Papers, New Series, No. 21. 
This result is expressed in the form of a definite integral whose evaluation
has not so far been effected.

By considering the related circular stochastic process, where $x_0$ is defined to
be the same observation as $x_T$, great simplification is obtained. Here the
joint distribution of $x_1, x_2, \cdots, x_T$ is

$$f(x_1, \cdots, x_T) = \frac{\lambda(\rho)}{(2\pi)^{T/2}\sigma^T}\,\exp\left[-\frac{1}{2\sigma^2(1 - \rho^2)}\left\{(1 + \rho^2)(x_1^2 + \cdots + x_T^2) - 2\rho(x_1x_2 + \cdots + x_Tx_1)\right\}\right], \tag{5}$$

$$\lambda(\rho) = \frac{1 - \rho^T}{(1 - \rho^2)^{T/2}}. \tag{6}$$

By smoothing characteristic values, Koopmans ([1], formula 92) found a definite
integral and Dixon ([2], 3.22) an explicit expression for an approximate distribution
of the circular serial correlation coefficient $\bar r$, for the case $\rho = 0$, where

$$\bar r = \frac{x_1x_2 + \cdots + x_Tx_1}{x_1^2 + \cdots + x_T^2}.$$

Dixon's distribution $\bar R_0(\bar r)$ has the simple form

$$\bar R_0(\bar r) = \frac{\Gamma\left(\frac{T}{2} + 1\right)}{\Gamma\left(\frac12\right)\Gamma\left(\frac{T}{2} + \frac12\right)}\,(1 - \bar r^2)^{(T-1)/2}. \tag{7}$$

Rubin [3] proved these results to be equivalent. On the other hand, R. L.
Anderson [4] obtained the exact distribution of $\bar r$ in the case $\rho = 0$. Madow [6]
extended this result to the case $\rho \ne 0$, using a property of sufficient statistics
also noted by Koopmans ([1], p. 17) in connection with the non-circular problem.

It would, however, be difficult to find percentile points or moments from
Madow's exact distribution. An approximate distribution of $\bar r$ for $\rho \ne 0$,
together with its moments, analogous to Dixon-Koopmans' for $\rho = 0$, should
therefore be of interest. The purpose of this paper is to obtain such a distribution
from the circular universe (5). The statistic $\bar r$ is shown to be a consistent
estimate of $\rho$ within the limits imposed by the approximation. In addition, an
approximate distribution for $p = x_1^2 + \cdots + x_T^2$ in the case $\rho \ne 0$ (which
reduces to the exact chi-squared distribution when $\rho = 0$) is derived, together
with all of its moments.


3. We begin by asking about an approximate joint distribution of $p$ and $q$ defined by

$$p = x_1^2 + \cdots + x_T^2, \qquad q = x_1x_2 + \cdots + x_Tx_1. \tag{8}$$
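Defining $\phi(u, v)$ as the expectation of $\exp[i(up + vq)]$ we have

$$\phi(u, v) = \frac{\lambda(\rho)}{(2\pi)^{T/2}\sigma^T}\int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty} \exp\left[-\frac{1}{2\sigma^2}\left\{y(x_1^2 + \cdots + x_T^2) - 2z(x_1x_2 + \cdots + x_Tx_1)\right\}\right] dx_1\cdots dx_T. \tag{9}$$

On integration, we find

$$\phi(u, v) = \lambda(\rho)\,[\Delta(u, v)]^{-1/2}, \tag{10}$$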

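where $\Delta(u, v)$ is the determinant of the matrix associated with the quadratic
form within the curly brackets in (9). $\Delta(u, v)$ is a circulant; its value as determined
from the circulant formula ([2], p. 123) is

$$\Delta(u, v) = \prod_{t=1}^{T}\left(y - 2z\cos\frac{2\pi t}{T}\right), \tag{11}$$

where $y$ and $z$ are defined by

$$y = \frac{1 + \rho^2}{1 - \rho^2} - 2i\sigma^2u, \qquad z = \frac{\rho}{1 - \rho^2} + i\sigma^2v. \tag{12}$$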

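To get an approximation $\bar\Delta(u, v)$ to $\Delta(u, v)$ we smooth $\log \Delta(u, v)$ by Koopmans'
method. We have

$$\log \Delta(u, v) = \sum_{t=1}^{T} \log\left(y - 2z\cos\frac{2\pi t}{T}\right). \tag{13}$$

We define $\bar\Delta(u, v)$ through

$$\log \bar\Delta(u, v) = \frac{T}{2\pi}\int_{0}^{2\pi} \log\,(y - 2z\cos\theta)\,d\theta, \tag{14}$$

in which the summation in (13) is replaced by integration. The integral in (14)
is easily evaluated ([6], p. 66) giving

$$\bar\Delta(u, v) = \left(\frac{y + \sqrt{y^2 - 4z^2}}{2}\right)^{T}. \tag{15}$$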
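The effect of replacing the sum (13) by the integral (14) can be gauged numerically. The sketch below is only an illustration for real $y > 2|z|$ (the paper's $y$ and $z$ are complex); it takes the smoothed value to be $T\log\{(y + \sqrt{y^2 - 4z^2})/2\}$, and the function names are hypothetical.

```python
import math

def log_delta_exact(y, z, T):
    # log of the product in (11), i.e. the sum in (13)
    return sum(math.log(y - 2.0 * z * math.cos(2.0 * math.pi * t / T))
               for t in range(1, T + 1))

def log_delta_smoothed(y, z, T):
    # value of the integral in (14), assuming real y > 2|z|
    return T * math.log((y + math.sqrt(y * y - 4.0 * z * z)) / 2.0)

rho, T = 0.5, 20
y, z = 1.0 + rho * rho, rho
print(log_delta_exact(y, z, T), log_delta_smoothed(y, z, T))
```

With $y = 1 + \rho^2$, $z = \rho$ the smoothed value is exactly zero, while the exact sum is $2\log(1 - \rho^T)$; the two differ by an amount that vanishes geometrically in $T$.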


Incidentally, had we used g* = XiXt,+i -h in place of = § in (9), 
we would have obtained the same expression (15) for Z(u, v). 

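Setting $\bar\phi(u, v) = \bar\lambda(\rho)[\bar\Delta(u, v)]^{-1/2}$, we may determine $\bar\lambda(\rho)$ by the requirement
$\bar\phi(0, 0) = 1$. A simple calculation yields the result $\bar\lambda(\rho) = (1 - \rho^2)^{-T/2}$. (Note
that $\lambda(\rho)/\bar\lambda(\rho) = 1 - \rho^T$ is close to 1 for large values of $T$.) Our result for $\bar\phi(u, v)$
appears as

(16)   $\bar\phi(u, v) = (1-\rho^2)^{-T/2} \left[ \dfrac{y + \sqrt{y^2 - 4z^2}}{2} \right]^{-T/2}.$

The approximate joint distribution of $p$ and $q$ may be written as the double
Fourier integral

(17)   $\bar D(p, q) = \dfrac{1}{4\pi^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \bar\phi(u, v)\, e^{-i(up + vq)}\, du\, dv,$

which we evaluate ([7], 576.3, 914.3) by changing integration variables from
$u, v$ to $y, z$ and integrating out $y$ and $z$ successively. We obtain finally

(18)   $\bar D(p, q) = \dfrac{T}{2}\, \dfrac{[2\sigma^2(1-\rho^2)]^{-T/2}}{\Gamma(\frac12)\,\Gamma(\frac{T}{2} + \frac12)}\, \dfrac{(p^2 - q^2)^{(T-1)/2}}{p^{T/2+1}}\, \exp\left\{ -\dfrac{(1+\rho^2)p - 2\rho q}{2\sigma^2(1-\rho^2)} \right\}.$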

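Changing variables from $p,\ q = p\bar r$ to $p,\ \bar r$, we obtain for $\bar F(p, \bar r)$, the approximate
joint distribution of $p$ and $\bar r$, the expression

(19)   $\bar F(p, \bar r) = \dfrac{T}{2}\, \dfrac{[2\sigma^2(1-\rho^2)]^{-T/2}}{\Gamma(\frac12)\,\Gamma(\frac{T}{2} + \frac12)}\, p^{T/2-1} (1-\bar r^2)^{(T-1)/2} \exp\left[ -\dfrac{(1+\rho^2 - 2\rho\bar r)\,p}{2\sigma^2(1-\rho^2)} \right].$

We could also have derived (19), following Madow, by noting that for $\rho = 0$,
$p$ and $\bar r$ are independently distributed, $p$ having the chi-squared distribution and $\bar r$
having approximately the Dixon distribution (7), and that $p$ and $\bar r$ are sufficient
statistics for the estimation of $\rho$ and $\sigma^2$.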

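4. The approximate marginal distribution $\bar R_\rho(\bar r)$ of $\bar r$ is obtained by an easy
integration from (19):

(20)   $\bar R_\rho(\bar r) = \int_0^\infty \bar F(p, \bar r)\,dp$
       $= \dfrac{T}{2}\, \dfrac{[2\sigma^2(1-\rho^2)]^{-T/2}}{\Gamma(\frac12)\,\Gamma(\frac{T}{2} + \frac12)}\, (1-\bar r^2)^{(T-1)/2} \int_0^\infty p^{T/2-1} \exp\left[ -\dfrac{(1+\rho^2 - 2\rho\bar r)\,p}{2\sigma^2(1-\rho^2)} \right] dp$
       $= \dfrac{\Gamma(\frac{T}{2} + 1)}{\Gamma(\frac12)\,\Gamma(\frac{T}{2} + \frac12)}\, (1-\bar r^2)^{(T-1)/2} (1+\rho^2 - 2\rho\bar r)^{-T/2}.$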


Our notation is consistent since $\bar R_\rho(\bar r)$ indeed reduces to the Dixon distribution
for $\rho = 0$. $\bar R_\rho(\bar r)$ has a maximum when

$\bar r = \bar r_{\max} = \dfrac{(1+\rho^2)(T-1) - \sqrt{T(T-2)(1-\rho^2)^2 + (1+\rho^2)^2}}{2\rho(T-2)}.$

A little manipulation shows that $1 > |\bar r_{\max}| > |\rho|$ and that $\bar r_{\max} \sim \rho$ asymptotically.
A graph (Fig. 1) of $\bar R_\rho(\bar r)$ for $T = 20$, $\rho = 0, .2, .5, .7, .9$ is appended,
from which it is seen that for $|\rho|$ near 1, the distribution becomes highly concentrated
about $\bar r_{\max}$. On differentiating with respect to $\rho$ and eliminating,
the envelope of the $\bar R_\rho(\bar r)$ is seen to be

$\dfrac{\Gamma(\frac{T}{2} + 1)}{\Gamma(\frac12)\,\Gamma(\frac{T}{2} + \frac12)}\, (1-\bar r^2)^{-1/2}.$
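The approximate density of the serial correlation coefficient and the location of its maximum can be explored numerically. The following is a minimal sketch, assuming the closed form $\Gamma(T/2+1)\,[\Gamma(1/2)\Gamma(T/2+1/2)]^{-1}(1-\bar r^2)^{(T-1)/2}(1+\rho^2-2\rho\bar r)^{-T/2}$ for the density; the function names and the Simpson integrator are illustrative choices, not part of the paper.

```python
import math


def leipnik_density(r, rho, T):
    """Approximate density of the serial correlation coefficient at r."""
    log_c = (math.lgamma(T / 2 + 1) - math.lgamma(0.5)
             - math.lgamma(T / 2 + 0.5))
    return (math.exp(log_c) * (1 - r * r) ** ((T - 1) / 2)
            * (1 + rho * rho - 2 * rho * r) ** (-T / 2))


def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def mode(rho, T):
    """Location of the maximum of the density (requires rho != 0 and T > 2)."""
    root = math.sqrt(T * (T - 2) * (1 - rho ** 2) ** 2 + (1 + rho ** 2) ** 2)
    return ((1 + rho ** 2) * (T - 1) - root) / (2 * rho * (T - 2))


if __name__ == "__main__":
    for rho in (0.2, 0.5, 0.7, 0.9):
        mass = simpson(lambda r: leipnik_density(r, rho, 20), -1.0, 1.0)
        print(rho, round(mass, 6), round(mode(rho, 20), 4))
```

The printed mass should be very close to 1 for each $\rho$, and the mode should lie between $\rho$ and 1, in line with the inequality stated in the text.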



Fig. 1. Graph of the Distribution of the Serial Correlation Coefficient in a Circular
Universe, for T = 20


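5. Before evaluating the moments of $\bar R_\rho(\bar r)$ we will pause to obtain the approximate
marginal distribution $\bar P_\rho(p)$ of $p$, and its moments. We write

(21)   $\bar P_\rho(p) = \int_{-1}^{1} \bar F(p, \bar r)\, d\bar r.$

If we define $I_\nu(z)$, the Bessel function of order $\nu$ and purely imaginary
argument, by

(22)   $I_\nu(z) = \sum_{n=0}^{\infty} \dfrac{(z/2)^{\nu + 2n}}{n!\,\Gamma(\nu + n + 1)},$

we obtain ([8], p. 79), if $\rho \neq 0$,

(23)   $\bar P_\rho(p) = \dfrac{T}{2\rho^{T/2}\,p} \exp\left[ -\dfrac{(1+\rho^2)\,p}{2\sigma^2(1-\rho^2)} \right] I_{T/2}\left( \dfrac{\rho p}{\sigma^2(1-\rho^2)} \right),$

and if $\rho = 0$,

(24)   $\bar P_0(p) = \dfrac{p^{T/2-1}}{(2\sigma^2)^{T/2}\,\Gamma(T/2)} \exp\left( -\dfrac{p}{2\sigma^2} \right),$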

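on performing the integration indicated in (21). $\bar P_0(p)$ coincides with the
exact distribution $P_0(p)$. An expression covering all moments of $\bar P_\rho(p)$ is
obtained from (16) by setting $v = 0$, differentiating, and setting $u = 0$. We have

(25)   $\bar\phi(u, 0) = \bar\lambda(\rho) \left[ \dfrac{y + \sqrt{y^2 - 4\rho^2/(1-\rho^2)^2}}{2} \right]^{-T/2},$

hence

(26)   $E[p^k] = (-2\sigma^2)^k (1-\rho^2)^{-T/2} \left[ \dfrac{\partial^k}{\partial y^k} \left( \dfrac{y + \sqrt{y^2 - 4\rho^2/(1-\rho^2)^2}}{2} \right)^{-T/2} \right]_{y = (1+\rho^2)/(1-\rho^2)}.$

From (26), we readily find

(27)   $E[p] = T\sigma^2, \qquad E\left[ \dfrac{p}{T} \right] = \sigma^2,$

(28)   $E[p^2] = (T\sigma^2)^2 + 2T\sigma^4\, \dfrac{1+\rho^2}{1-\rho^2}.$

Thus the unbiased character of $p/T$ as an estimate of $\sigma^2$ is reflected in the approximate
distribution, while (28), which shows that $\lim_{T\to\infty} \sigma^2_{p/T} = 0$, indicates
that consistency is also reflected.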

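6. We now calculate the moments of $\bar R_\rho(\bar r)$. Interchanging the order of integration
in the expression for $E[\bar r^k]$ is justified by the uniform convergence, so we have

(29)   $E[\bar r^k] = \int_{-1}^{1} \bar r^k \left[ \int_0^\infty \bar F(p, \bar r)\,dp \right] d\bar r = \int_0^\infty \left[ \int_{-1}^{1} \bar r^k \bar F(p, \bar r)\,d\bar r \right] dp$
       $= \dfrac{T}{2}\, \dfrac{[2\sigma^2(1-\rho^2)]^{-T/2}}{\Gamma(\frac12)\,\Gamma(\frac{T}{2} + \frac12)} \int_0^\infty p^{T/2-1} \exp\left[ -\dfrac{(1+\rho^2)\,p}{2\sigma^2(1-\rho^2)} \right] \left[ \int_{-1}^{1} \bar r^k (1-\bar r^2)^{(T-1)/2} \exp(m\bar r)\,d\bar r \right] dp,$

where $m$ is defined by

(30)   $m = \dfrac{\rho p}{\sigma^2(1-\rho^2)}.$

Defining $G(m)$ by

(31)   $G(m) = \int_{-1}^{1} (1-\bar r^2)^{(T-1)/2} \exp(m\bar r)\,d\bar r,$

we have ([8], p. 79)

(32)   $G(m) = \Gamma(\tfrac12)\,\Gamma(\tfrac{T}{2} + \tfrac12) \left( \dfrac{m}{2} \right)^{-T/2} I_{T/2}(m).$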

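Differentiating each side of (32) $k$ times, we find by (31) and (32)

(33)   $\dfrac{d^k}{dm^k} G(m) = \int_{-1}^{1} \bar r^k (1-\bar r^2)^{(T-1)/2} \exp(m\bar r)\,d\bar r = 2^{T/2}\,\Gamma(\tfrac12)\,\Gamma(\tfrac{T}{2} + \tfrac12)\, \dfrac{d^k}{dm^k} \left[ m^{-T/2} I_{T/2}(m) \right].$

Using the identity ([8], p. 79)

$\dfrac{d}{dz} \left[ z^{-\nu} I_\nu(z) \right] = z^{-\nu} I_{\nu+1}(z)$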


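and changing the integration variable in (29) from $p$ to $m$, we obtain finally

(34)   $E[\bar r^k] = \dfrac{T}{2\rho^{T/2}} \int_0^\infty m^{T/2-1} \exp\left[ -\dfrac{(1+\rho^2)\,m}{2\rho} \right] \dfrac{d^k}{dm^k} \left[ m^{-T/2} I_{T/2}(m) \right] dm.$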

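For $k = 1$, we have ([8], p. 386)

(35)   $E[\bar r] = \dfrac{\rho}{1 + 2/T}.$

For $k = 2$, after some tedious calculation, we find

(36)   $E[\bar r^2] = \dfrac{1}{T+2} + \dfrac{\rho^2\,T(T+1)}{(T+2)(T+4)}, \qquad \sigma_{\bar r}^2 = \dfrac{1}{T+2} \left\{ 1 - \dfrac{\rho^2\,T(T-2)}{(T+2)(T+4)} \right\}.$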





We note that $\lim_{T\to\infty} E(\bar r) = \rho$ and $\lim_{T\to\infty} \sigma_{\bar r}^2 = 0$, so that at least to the extent of
approximation furnished by $\bar R_\rho(\bar r)$, $\bar r$ is a consistent estimate of $\rho$.
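The first two moments can be checked against direct numerical integration of the approximate density. This is a sketch only: it assumes the density $\propto (1-\bar r^2)^{(T-1)/2}(1+\rho^2-2\rho\bar r)^{-T/2}$ and the closed forms $E[\bar r] = \rho T/(T+2)$ and $\sigma_{\bar r}^2 = \{1 - \rho^2 T(T-2)/[(T+2)(T+4)]\}/(T+2)$; all function names are hypothetical.

```python
import math


def leipnik_density(r, rho, T):
    """Approximate density of the serial correlation coefficient at r."""
    log_c = (math.lgamma(T / 2 + 1) - math.lgamma(0.5)
             - math.lgamma(T / 2 + 0.5))
    return (math.exp(log_c) * (1 - r * r) ** ((T - 1) / 2)
            * (1 + rho * rho - 2 * rho * r) ** (-T / 2))


def moment(k, rho, T, n=4000):
    """k-th moment about zero, by composite Simpson integration on [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n + 1):
        r = -1.0 + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * r ** k * leipnik_density(r, rho, T)
    return total * h / 3


def mean_formula(rho, T):
    """Closed form for the mean, rho / (1 + 2/T)."""
    return rho * T / (T + 2)


def variance_formula(rho, T):
    """Closed form for the variance."""
    return (1 - rho ** 2 * T * (T - 2) / ((T + 2) * (T + 4))) / (T + 2)


if __name__ == "__main__":
    for T in (10, 20, 100):
        mean = moment(1, 0.5, T)
        var = moment(2, 0.5, T) - mean ** 2
        print(T, mean, mean_formula(0.5, T), var, variance_formula(0.5, T))
```

As $T$ increases the mean approaches $\rho$ and the variance shrinks toward zero, which is the consistency property noted above.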

The author wishes to express his gratitude to Dr. T. Koopmans, under whose
kind direction this paper was written.


CONCERNING THE EFFECT OF INTRACLASS CORRELATION ON
CERTAIN SIGNIFICANCE TESTS

By John E. Walsh
Princeton University 

1. Summary. In practical applications it is frequently assumed that the
values obtained by a sampling process are independently drawn from the same
normal population. Then confidence intervals and significance tests which were
derived under the assumption of independence are applied using these values.
Often the assumption of independence between the values may be at best only
approximately valid. For some cases, however, it may be permissible to assume
that the correlation between each two values is the same (intraclass correlation).
The purpose of this paper is to investigate the effect of this intraclass correlation
on the confidence coefficients and significance levels of several well known
confidence intervals and significance tests which were derived under the assumption
of independence, and to extend these considerations to the case of two
sets of values.

In the first part of the paper the relations given in Table I are used to compute
tables which show the effect of intraclass correlation on the confidence coefficients
and significance levels of the confidence intervals and significance tests listed in
Table II. The second part of the paper consists of the proofs of the relations
given in Table I.

2. Introduction. Let the $n$ values $x_1, \cdots, x_n$ represent a single value of a
normal multivariate population for which each of the $n$ variables has mean $\mu$,
variance $\sigma^2$, and the correlation between each two variables is $\rho$. These $n$
values will be called a correlated "sample." The values $x_1, \cdots, x_n$ and
$y_1, \cdots, y_m$ are said to represent two correlated "samples" if they have a normal
multivariate distribution such that the $x$'s have mean $\mu$, variance $\sigma^2$, correlation
$\rho$, the $y$'s have mean $\mu'$, variance $\sigma'^2$, correlation $\rho'$, and the correlation between
each $x$ and $y$ is $\rho''$. This paper shows that several well known quantities which
have Student $t$, $\chi^2$ or Snedecor $F$ distributions when the values form random
samples still have those same distributions for correlated "samples" if the quantities
are multiplied by suitable constant factors, where it is to be remembered
that for normal populations a correlated "sample" is a random sample if and
only if $\rho = 0$ and that two correlated "samples" represent two random samples
if and only if $\rho = \rho' = \rho'' = 0$. The quantities considered and the corresponding
factors are listed in Table I, where $\bar x = \sum_1^n x_i/n$ and $\bar y = \sum_1^m y_a/m$. Several commonly
used confidence intervals and significance tests based on these quantities
and derived under the assumption of randomness are considered, and tables are
computed which show how the confidence coefficients and significance levels of
these confidence intervals and significance tests vary if the values are from
correlated "samples" instead of random samples. Table II contains an outline
of the confidence intervals and significance tests considered. It is found that
these confidence coefficients and significance levels can change noticeably when a
correlated "sample" is considered. This is particularly true for the Student
$t$-test. For example, in one case it is found that if the sample size is 32 and the
significance level is .05 when $\rho = 0$, then the significance level becomes .23 for
$\rho = .05$. This large change in significance level for a small change in $\rho$ is explained
by the factor given for the Student $t$-distribution in Table I. This
shows that test results which appear to be "significant" under the assumption of
randomness are not necessarily "significant" when correlation is present, even
though the amount of correlation may be small. The effect of correlation on the


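TABLE I

Quantity | Distribution for Random Sample | Factor Multiplying Statistic for Correlated "Samples"
$\dfrac{(\bar x - \mu)\sqrt{n(n-1)}}{S} = \dfrac{(\bar x - \mu)\sqrt{n(n-1)}}{\sqrt{\sum_1^n (x_i - \bar x)^2}}$ | Student $t$-distribution, $g_{n-1}(t)\,dt$ | $\sqrt{\dfrac{1-\rho}{1+(n-1)\rho}}$
$\dfrac{S^2}{\sigma^2} = \dfrac{1}{\sigma^2} \sum_1^n (x_i - \bar x)^2$ | $\chi^2$-distribution, $f_{n-1}(\chi^2)\,d\chi^2$ | $\dfrac{1}{1-\rho}$
$\dfrac{S^2/\sigma^2}{S'^2/\sigma'^2}$ | Snedecor $F$-distribution, $h_{n-1,\,m-1}(F)\,dF$ | $\dfrac{1-\rho'}{1-\rho}$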


$\chi^2$ and Snedecor $F$ tests is not as great as for the Student $t$-test, as can be seen from
the factors given for the $\chi^2$ and Snedecor $F$ distributions in Table I.

3. Effect of intraclass correlation. The relations stated in Table I will now
be used to investigate the effect of intraclass correlation on the confidence coefficients
and significance levels of several common types of confidence intervals
and significance tests which were derived under the assumption of random
samples. The confidence intervals and significance tests considered are listed
in Table II, where $S^2$ and $S'^2$ are defined in Table I. These particular confidence
intervals and significance tests have the property that if $\alpha$ is the confidence
coefficient of the confidence interval listed for a given statistic, then $1 - \alpha$ is
the significance level of the significance test listed for that statistic, this relation
holding whether random samples or correlated "samples" are considered. For
this reason the tables given in this section will be limited to confidence coefficients;
the corresponding significance levels can be obtained by using the above
relation.

a. Student t-distribution. If a random sample of size $n$ is drawn from a normal
population with mean $\mu$ and variance $\sigma^2$ (denoted by $N(\mu, \sigma^2)$), a confidence
interval for $\mu$ with confidence coefficient $\epsilon$ is given in Table II. If the $n$ values
form a correlated "sample", however, it follows from Table I that the corresponding
confidence interval with coefficient $\epsilon$ is

$\bar x - t_\epsilon S \sqrt{\dfrac{1+(n-1)\rho}{n(n-1)(1-\rho)}} \le \mu \le \bar x + t_\epsilon S \sqrt{\dfrac{1+(n-1)\rho}{n(n-1)(1-\rho)}}.$


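TABLE II

Statistic | Parameter Examined | Confidence Interval (Confidence Coefficient $\epsilon$) | Significance Test (Significance Level $= 1 - \epsilon$) | Definition of Constants
$t$ | $\mu$ | $\bar x - \dfrac{t_\epsilon S}{\sqrt{n(n-1)}} \le \mu \le \bar x + \dfrac{t_\epsilon S}{\sqrt{n(n-1)}}$ | $\lvert \bar x - \mu \rvert \ge \dfrac{t_\epsilon S}{\sqrt{n(n-1)}}$ | $\int_{-t_\epsilon}^{t_\epsilon} g_{n-1}(t)\,dt = \epsilon$
$\chi^2$ | $\sigma^2$ | $0 \le \sigma^2 \le S^2/\chi^2_\epsilon$ | $\dfrac{S^2}{\sigma^2} \le \chi^2_\epsilon$ | $\int_{\chi^2_\epsilon}^{\infty} f_{n-1}(\chi^2)\,d\chi^2 = \epsilon$
$F$ | $\sigma^2/\sigma'^2$ | $0 \le \sigma^2/\sigma'^2 \le S^2/(S'^2 F_\epsilon)$ | $\dfrac{S^2/\sigma^2}{S'^2/\sigma'^2} \le F_\epsilon$ | $\int_{F_\epsilon}^{\infty} h_{n-1,\,m-1}(F)\,dF = \epsilon$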


The confidence interval given in Table II can be rewritten as

$\bar x - t_\alpha S \sqrt{\dfrac{1+(n-1)\rho}{n(n-1)(1-\rho)}} \le \mu \le \bar x + t_\alpha S \sqrt{\dfrac{1+(n-1)\rho}{n(n-1)(1-\rho)}},$

where

$t_\alpha = t_\epsilon \sqrt{\dfrac{1-\rho}{1+(n-1)\rho}}.$

Hence if $\rho < 0$, $\alpha > \epsilon$ and the confidence coefficient of the confidence interval
in Table II is greater than $\epsilon$. This means that the significance level of the
corresponding significance test listed in Table II would be less than $1 - \epsilon$ so
that any test result which would be significant for a random sample would also
be significant for a correlated "sample" for which $\rho < 0$. If $\rho > 0$, however,
$\epsilon > \alpha$ and the significance level of the test would be greater than $1 - \epsilon$. Thus a
test result which would be significant for a random sample need no longer be
when $\rho > 0$. The effect of positive values of $\rho$ upon the confidence coefficient
$\alpha = \alpha_t(\rho, n)$ of the confidence interval of Table II is given in Table III for the
cases $\epsilon = .95$ and $.99$. Confidence intervals with unequal tails can be treated
in a similar manner. It is thus seen that the effect of correlation on the confidence
coefficient increases with the sample size $n$, and that even a very small
amount of correlation can cause a large change in $\alpha$. For example, for samples
of size 16 a correlation of $\rho = .05$ will change the significance level from .05 to
.135; for samples of size 32 a correlation of $\rho = .05$ will change the significance
level from .01 to .102, and from .05 to .23.
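The quoted changes in significance level can be reproduced approximately from the Table I factor. The sketch below is a hedged illustration using only the Python standard library: the Student $t$ probabilities are obtained by Simpson integration of the density and the percentage point by bisection, and the function names (`alpha_t`, `t_quantile`, `central_prob`) are hypothetical.

```python
import math


def t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))


def central_prob(c, nu, steps=2000):
    """P(|t_nu| <= c), by composite Simpson integration of the density."""
    if c <= 0:
        return 0.0
    h = c / steps
    total = t_pdf(0.0, nu) + t_pdf(c, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 2 * total * h / 3


def t_quantile(eps, nu):
    """Value t_eps with P(|t_nu| <= t_eps) = eps, found by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_prob(mid, nu) < eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def alpha_t(rho, n, eps):
    """Actual confidence coefficient of the nominal-eps interval under correlation rho."""
    factor = math.sqrt((1 - rho) / (1 + (n - 1) * rho))
    return central_prob(t_quantile(eps, n - 1) * factor, n - 1)


if __name__ == "__main__":
    for n, eps in ((16, 0.95), (32, 0.99), (32, 0.95)):
        print(n, eps, round(1 - alpha_t(0.05, n, eps), 3))
```

The printed significance levels should land close to the figures quoted in the text; small differences in the third decimal are to be expected from rounding in the published tables.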

Confidence intervals for $\mu - \mu'$ are given by Theorem 5 of section 4. It is to be
observed that if $\rho = \rho' = \rho''$ and $\sigma = \sigma'$ the confidence coefficients are independent
of $\rho$ and $\sigma$. If $m = n$, $\rho = \rho'$, $\sigma = \sigma'$, $\rho'' = 0$, however, the confidence
coefficients of the confidence intervals for $\mu - \mu'$ have the values $\alpha = \alpha_t(\rho, n)$
given in Table III.


TABLE III
Values of $\alpha_t(\rho, n)$



0 

.05 

.1 

.2 

3 



rt 









.00 


.983 

.974 

■■ 

,944 

.920 

4 

.06 


.921 

.890 


.805 

.744 

8 

.00 


.969 

913 

.863 

.790 


05 


.865 

.767 

.620 



16 

.90 


.903 

.796 

.690 

.600 

.615 

,96 

.866 

.74 

,64 

.64 



32 

.99 


79 

.63 




.96 


.68 





64 

.99 

.79 






128 

.99 

.68 

1 






b. $\chi^2$-distribution. If a random sample of size $n$ is drawn from $N(\mu, \sigma^2)$, a confidence
interval for $\sigma^2$ with coefficient $\epsilon$ is given in Table II. If the $n$ values form
a correlated "sample", it follows from Table I that the corresponding confidence
interval with coefficient $\epsilon$ is

$0 \le \sigma^2 \le \dfrac{S^2}{\chi^2_\epsilon (1 - \rho)}.$

The confidence interval in Table II can be rewritten as

$0 \le \sigma^2 \le \dfrac{S^2}{\chi^2_\alpha (1 - \rho)},$

where

$\chi^2_\alpha = \chi^2_\epsilon / (1 - \rho).$

Hence if $\rho < 0$, $\alpha > \epsilon$ and the significance level of the significance test given in
Table II is less than $1 - \epsilon$. If $\rho > 0$, the significance level of the test is greater
than $1 - \epsilon$. The effect of positive values of $\rho$ upon the confidence coefficient
$\alpha = \alpha_{\chi^2}(\rho, n)$ of the confidence interval listed in Table II is given in Table IV
for $\epsilon = .95$ and $.99$. Cases in which the lower limit of the confidence interval
is not zero can be treated in a similar manner. Table IV shows that the confidence
coefficient $\alpha = \alpha_{\chi^2}(\rho, n)$ decreases with the sample size $n$ for a fixed value
of $\rho$. Although the effect of correlation for the $\chi^2$-distribution is not as great as
for the Student $t$-distribution, it does cause a noticeable change in $\alpha$. For
example, for samples of size 16 the significance level of the test in Table II is
changed from .05 to .081 if $\rho = .1$ and from .05 to .13 if $\rho = .2$. For samples of
size 32 the significance level is changed from .05 to .10 for $\rho = .1$ and from .05 to
.19 for $\rho = .2$.
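The same kind of check can be made for the $\chi^2$ interval. The sketch below is illustrative only: it assumes the relation $\chi^2_\alpha = \chi^2_\epsilon/(1-\rho)$, evaluates the $\chi^2$ distribution function through the standard power series for the regularized incomplete gamma function, and uses hypothetical function names.

```python
import math


def chi2_cdf(x, k):
    """P(chi^2_k <= x), via the series for the regularized incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = k / 2, x / 2
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return math.exp(a * math.log(z) - z - math.lgamma(a)) * total


def chi2_lower_point(eps, k):
    """chi^2_eps with P(chi^2_k >= chi^2_eps) = eps, found by bisection."""
    lo, hi = 0.0, k + 10 * math.sqrt(2 * k) + 10
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, k) < 1 - eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def alpha_chi2(rho, n, eps):
    """Actual confidence coefficient of the nominal-eps interval under correlation rho."""
    point = chi2_lower_point(eps, n - 1) / (1 - rho)
    return 1 - chi2_cdf(point, n - 1)


if __name__ == "__main__":
    for n in (16, 32):
        for rho in (0.1, 0.2):
            print(n, rho, round(1 - alpha_chi2(rho, n, 0.95), 3))
```

The printed significance levels should be near .08 and .13 for $n = 16$ and near .10 and .19 for $n = 32$, matching the examples in the text.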

c. Snedecor F-distribution. If two random samples, one of size $n$ (denoted
by $x$'s) and the other of size $m$ (denoted by $y$'s), are drawn from $N(\mu, \sigma^2)$
and $N(\mu', \sigma'^2)$ respectively, a confidence interval for $\sigma^2/\sigma'^2$ with coefficient $\epsilon$


TABLE IV
Values of $\alpha_{\chi^2}(\rho, n)$



0 

.1 

2 

3 

■i 


n 







i 

.90 

.988 

.986 

983 

.979 

.971 

.95 

.941 

.930 

.918 

.900 

,872 

IG 

.99 

982 

900 

.941 

,800 

.700 

.95 

919 

S7 


07 

.49 

32 

99 


946 


,716 

.44 

96 

■■ 

.81 


.38 

.17 


is given in Table II. If the values form two correlated "samples", however,
it follows from Table I that the corresponding confidence interval with coefficient
$\epsilon$ is

$0 \le \dfrac{\sigma^2}{\sigma'^2} \le \dfrac{S^2(1 - \rho')}{S'^2(1 - \rho)\,F_\epsilon}.$

The confidence interval in Table II can be restated as

$0 \le \dfrac{\sigma^2}{\sigma'^2} \le \dfrac{S^2(1 - \rho')}{S'^2(1 - \rho)\,F_\alpha},$

where

$F_\alpha = F_\epsilon (1 - \rho')/(1 - \rho).$

Thus if $\rho = \rho'$, $\alpha = \epsilon$ and the significance level of the significance test given in
Table II remains equal to $1 - \epsilon$. If $(1 - \rho')/(1 - \rho) < 1$, $\alpha > \epsilon$ and the
significance level is less than $1 - \epsilon$. If $(1 - \rho')/(1 - \rho) > 1$, however, $\alpha < \epsilon$
and the significance level is greater than $1 - \epsilon$. Values of the confidence
coefficient $\alpha = \alpha_F\left( \dfrac{1-\rho'}{1-\rho}, n, m \right)$ of the confidence interval listed in Table II are
given in Table V for $\epsilon = .95$ and $.99$. Cases in which the lower limit of the
confidence interval is not zero can be treated in a manner similar to that given
above. Table V indicates that the effect of correlation on the confidence
coefficient is not as great for $n < m$ as for $n > m$. For example, if $n = 4$, $m = 32$,
$\dfrac{1-\rho'}{1-\rho} = 1.25$, the significance level of the significance test given in Table II is
only changed from .05 to .069, if $\dfrac{1-\rho'}{1-\rho} = 1.5$ from .05 to .087. If $n = 32$, $m = 4$,
$\dfrac{1-\rho'}{1-\rho} = 1.25$, however, the significance level is changed from .05 to .094, if
$\dfrac{1-\rho'}{1-\rho} = 1.5$ from .05 to .142. Also it is seen that for fixed $\dfrac{1-\rho'}{1-\rho}$, the effect of

intraclass correlation increases with both n and m.

4. Analysis. This section contains derivations of the relations stated in the
first three sections. The method used in these derivations is similar to that used
in one approach to the analysis of variance and consists essentially in expressing
each variable as the sum of two quantities, one of which is the same for each
variable and the other of which is different for each variable.

Let $x_1, \cdots, x_n$ represent a correlated "sample", that is, have a normal
multivariate distribution for which

(1)   $E(x_i) = \mu,$   $(i = 1, \cdots, n)$
      $E[(x_i - \mu)^2] = \sigma^2,$   $(i = 1, \cdots, n)$
      $E[(x_i - \mu)(x_j - \mu)] = \rho\sigma^2,$   $(i \neq j = 1, \cdots, n).$

Write the $x_i$, $(i = 1, \cdots, n)$, in the form

$x_i = \eta + \lambda\bar\xi + \xi_i,$

where $\bar\xi = \sum_1^n \xi_i/n$ and $\eta, \xi_1, \cdots, \xi_n$ are independently distributed, $\eta$ according to
$N(\mu, \sigma_1^2)$ and the $\xi_i$ according to $N(0, \sigma_2^2)$. The values of $\lambda$, $\sigma_1^2$ and $\sigma_2^2$ are chosen
so that the $x_i = \eta + \lambda\bar\xi + \xi_i$ satisfy (1). It is easily proved that it is always
possible to choose $\lambda$, $\sigma_1^2$ and $\sigma_2^2$ so that (1) are satisfied. It is to be remembered
that $\rho \ge -1/(n - 1)$ for intraclass correlation. From relations (1) and
$x_i = \eta + \lambda\bar\xi + \xi_i$ it follows that

(2)   $E(\xi_i^2) = \sigma^2(1 - \rho),$   $(i = 1, \cdots, n).$

Theorem 1. The quantity $\dfrac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar x)^2$ has a $\chi^2$-distribution with
$n - 1$ degrees of freedom and is distributed independently of $\bar x$.

Proof. Since the $\xi_i$ are independently distributed according to the same
normal distribution with zero mean, it follows from (2) that

$\dfrac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar x)^2 = \dfrac{1}{\sigma^2(1-\rho)} \sum_1^n (\xi_i - \bar\xi)^2$

has a $\chi^2$-distribution with $n - 1$ degrees of freedom and is distributed independently
of $\bar x = \eta + (1 + \lambda)\bar\xi$.

Theorem 2. $\dfrac{(\bar x - \mu)\sqrt{n(n-1)}}{\sqrt{\sum_1^n (x_i - \bar x)^2}} \sqrt{\dfrac{1-\rho}{1+(n-1)\rho}}$ has a Student $t$-distribution
with $n - 1$ degrees of freedom.

Proof. It is easily seen from elementary considerations that

$\dfrac{(\bar x - \mu)\sqrt{n}}{\sigma\sqrt{1+(n-1)\rho}}$

has the distribution $N(0, 1)$. Theorem 2 is then an immediate consequence of
Theorem 1.
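The claim that suitable constants always exist for the representation $x_i = \eta + \lambda\bar\xi + \xi_i$ can be made concrete. The sketch below shows one possible choice (an assumption of this illustration, not necessarily the one the author had in mind): $\lambda = 0$ when $\rho \ge 0$, and $\sigma_1^2 = 0$ when $\rho < 0$. It then recomputes the variance, covariance and variance of the mean implied by the representation. Function names are hypothetical.

```python
import math


def decomposition(rho, sigma2, n):
    """Return (lam, var_eta, var_xi) reproducing variance sigma2 and correlation rho."""
    if not -1 / (n - 1) <= rho < 1:
        raise ValueError("need -1/(n-1) <= rho < 1")
    var_xi = sigma2 * (1 - rho)            # relation (2)
    if rho >= 0:
        return 0.0, rho * sigma2, var_xi   # common component carries the covariance
    lam = math.sqrt((1 + (n - 1) * rho) / (1 - rho)) - 1
    return lam, 0.0, var_xi                # negative covariance comes from lam < 0


def implied_moments(lam, var_eta, var_xi, n):
    """Variance of x_i, covariance of x_i and x_j, and variance of the mean."""
    shared = var_eta + var_xi * (lam ** 2 + 2 * lam) / n
    variance = shared + var_xi
    covariance = shared
    var_mean = var_eta + (1 + lam) ** 2 * var_xi / n
    return variance, covariance, var_mean


if __name__ == "__main__":
    for rho in (0.3, -0.1):
        lam, v1, v2 = decomposition(rho, 2.0, 5)
        print(rho, implied_moments(lam, v1, v2, 5))
```

In both cases the variance of the mean equals $\sigma^2[1 + (n-1)\rho]/n$, which is the quantity behind the normalizing factor in Theorem 2.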


Up to this point a single correlated "sample" of size $n$ has been considered.
The next part of the analysis, however, will be concerned with properties which

Let $x_1, \cdots, x_n, y_1, \cdots, y_m$ have a joint normal multivariate distribution
such that

(3)   $E(x_i) = \mu, \quad E(y_a) = \mu',$
      $E[(x_i - \mu)^2] = \sigma^2,$   $(i = 1, \cdots, n)$
      $E[(y_a - \mu')^2] = \sigma'^2,$   $(a = 1, \cdots, m)$
      $E[(x_i - \mu)(x_j - \mu)] = \rho\sigma^2,$   $(i \neq j = 1, \cdots, n)$
      $E[(y_a - \mu')(y_\beta - \mu')] = \rho'\sigma'^2,$   $(a \neq \beta = 1, \cdots, m)$
      $E[(x_i - \mu)(y_a - \mu')] = \rho''\sigma\sigma'.$

Write the $x_i$ and $y_a$ in the form

(4)   $x_i = \eta + \lambda_1\bar\xi + \lambda_2\bar\xi' + \xi_i,$
      $y_a = \eta' + \lambda_1'\bar\xi + \lambda_2'\bar\xi' + \xi_a',$

where $\bar\xi' = \sum_1^m \xi_a'/m$ and $\eta, \eta', \xi_1, \cdots, \xi_n, \xi_1', \cdots, \xi_m'$ are independently
distributed, $\eta$ according to $N(\mu, \sigma_1^2)$, $\eta'$ according to $N(\mu', \sigma_1'^2)$, the $\xi_i$ according to
$N(0, \sigma_2^2)$, and the $\xi_a'$ according to $N(0, \sigma_2'^2)$. The quantities $\lambda_1, \lambda_2, \lambda_1', \lambda_2',$
$\sigma_1^2, \sigma_1'^2, \sigma_2^2, \sigma_2'^2$ are chosen so that the $x_i$ and $y_a$ satisfy (3). It is easily verified
that it is always possible to choose these quantities so that the $x_i$ and $y_a$ constructed
in this fashion satisfy (3). In addition it follows from (3) and (4) that

(5)   $E(\xi_i^2) = \sigma^2(1 - \rho),$
      $E(\xi_a'^2) = \sigma'^2(1 - \rho').$

Theorem 3. $\dfrac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar x)^2$ and $\dfrac{1}{\sigma'^2(1-\rho')} \sum_1^m (y_a - \bar y)^2$ have $\chi^2$-distributions
with $n - 1$ and $m - 1$ degrees of freedom respectively, and are distributed
independently of each other and of $\bar x$ and $\bar y$.

Proof. From Theorem 1 and (5) it follows that $\dfrac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar x)^2$
and $\dfrac{1}{\sigma'^2(1-\rho')} \sum_1^m (y_a - \bar y)^2$ have $\chi^2$-distributions with $n - 1$ and $m - 1$ degrees
of freedom respectively. That they are distributed independently of each other
and of both $\bar x$ and $\bar y$ follows from (4).

Theorem 4. $\dfrac{\sigma'^2(1 - \rho') \sum_1^n (x_i - \bar x)^2}{\sigma^2(1 - \rho) \sum_1^m (y_a - \bar y)^2}$ is distributed according to the Snedecor
$F$-distribution $h_{n-1,\,m-1}(F)\,dF$.

Proof. This follows from Theorem 3.





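Theorem 5. $\dfrac{[(\bar x - \bar y) - (\mu - \mu')]\sqrt{n + m - 2}}{\sigma_3 \sqrt{\dfrac{\sum_1^n (x_i - \bar x)^2}{\sigma^2(1-\rho)} + \dfrac{\sum_1^m (y_a - \bar y)^2}{\sigma'^2(1-\rho')}}},$

where

$\sigma_3^2 = \dfrac{\sigma^2}{n}[1 + (n - 1)\rho] + \dfrac{\sigma'^2}{m}[1 + (m - 1)\rho'] - 2\rho''\sigma\sigma',$

has a Student $t$-distribution with $n + m - 2$ degrees of freedom.

Proof. It is easily seen from elementary considerations that $\dfrac{1}{\sigma_3}[(\bar x - \bar y) - (\mu - \mu')]$
has the distribution $N(0, 1)$. Theorem 5 then follows from Theorem 3.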

The author wishes to express his appreciation to Professor John W. Tukey for 
valuable assistance and advice in the preparation of this paper. 



ON FAMILIES OF ADMISSIBLE TESTS 

By E. L. Lehmann 
University of California, Berkeley 

1. Summary. For each hypothesis $H$ of a certain class of simple hypotheses, a
family F of tests is determined such that

(a) given any test $w$ of $H$ there exists a test $w'$ belonging to F which has power
uniformly greater than or equal to that of $w$.

(b) no member of F has power uniformly greater than or equal to that of any
other member of F.

The effect on F of various assumptions about the set of alternatives is considered.
As an application an optimum property of the known type $A_1$ tests is
proved, and a result is obtained concerning the most stringent tests of the
hypotheses considered.

2. Introduction. In the theory of testing simple hypotheses, if a uniformly
most powerful test exists, it is the most desirable test to use. If, as is generally
the case, such a test does not exist, the choice between tests none of which is
"altogether better" than all the others, has to be based on information not contained
in the general formulation of the testing problem. If no such additional
information is available, the choice must of necessity be somewhat arbitrary.

Now although a single uniformly most powerful test exists only in exceptional
cases, there will always exist a family F of tests such that

(a) given any test $w$ of the hypothesis $H$ under consideration and of prescribed
level of significance, there exists a test $w'$ belonging to F which has power
uniformly greater than or equal to that of $w$.

(b) no member of F has power uniformly greater than or equal to that of any
other member of F.

The family F is essentially unique. Arbitrariness occurs only since a test region
is not uniquely determined by its power function. But since two tests with the
same power function are equivalent for testing purposes, it is from the present
point of view immaterial which one is included in F.

With the same restriction F is essentially the family of admissible tests, a
test $w$ being admissible if there is no test of the same level of significance which
has power uniformly greater than or equal to but not identically equal to that of
$w$. This definition differs only trivially from the one given by Wald [1, p. 16]
who defines a test $w$ to be non-admissible if there exists a $w'$ with power everywhere
greater than that of $w$ (except at the hypothetical point).

F naturally depends on the class of alternatives considered. A restriction
in the class of alternatives may (although it will not necessarily) diminish F.
The family F may also be decreased by other additional information: For
instance a probability distribution may be assumed for the set of alternatives,
and some properties of this distribution may be presupposed.

The determination of the family F (and a description of the power functions
of the tests in F) might be considered a solution of the testing problem. The
solution is not unique and hence does not provide a basis for action. This
reflects the fact that additional information is needed to make possible the
unique choice of a best test. On the basis of the available information, F represents
the furthest reduction of the problem that seems possible. On the one
hand, if the choice of test is to be made from the point of view of power, the only
contestants for "best test" are the members of F. On the other hand, the
available information does not give preference to any one member of F over any
other unless additional principles (such as unbiasedness for instance) are
introduced.

It is the purpose of the present paper to illustrate the above notions by determining
F for a very simple case.


3. Determination of the family F. Let the random variable

$E = (X_1, X_2, \cdots, X_n)$

have a probability density function

(1)   $p_\theta(e)$

depending on parameter $\theta$. Concerning (1) we shall make the assumptions
under which Neyman [2, 3] has shown the existence of the type $A_1$ test of the
hypothesis

(2)   $H: \theta = \theta_0.$

Assumptions:

(a) Conditions of regularity:

The integral

(3)   $\int_w p_\theta(e)\,de, \qquad e = (x_1, \cdots, x_n), \quad de = dx_1 \cdots dx_n,$

extended over any region $w$ in the sample space, admits of two successive derivatives
with respect to $\theta$ under the integral sign, i.e.

(4)   $\dfrac{d^k}{d\theta^k} \int_w p_\theta(e)\,de = \int_w \dfrac{\partial^k}{\partial\theta^k} p_\theta(e)\,de, \qquad k = 1, 2.$

(b) A differential equation:

If

(5)   $\varphi_\theta(e) = \dfrac{\partial}{\partial\theta} \log p_\theta(e),$

$\varphi_\theta$ is not identically zero, and there exist functions of $\theta$ (but independent of $e$),
$A$ and $B$, such that

(6)   $\dfrac{\partial\varphi_\theta}{\partial\theta} = A + B\varphi_\theta.$

Under these assumptions Neyman has shown

A. that the probability density function $p_\theta$ is of the form

(7)   $p_\theta(e) = \exp\{P(\theta) + T(e) \cdot Q(\theta) + R(e)\},$

where $Q$ is a monotone function with $\dfrac{d}{d\theta} Q(\theta) \neq 0$ (without loss of generality
we shall assume $Q$ monotonely increasing) and

B. that the type $A_1$ test of the hypothesis $H$ exists, and is given by

(8)   $T(e) < c_1, \qquad T(e) > c_2$

for suitable choice of $c_1$ and $c_2$.

In what follows we shall assume that the permissible first kind error in testing
$H$ is fixed throughout and has the value $\epsilon$. By a test $w$ of $H$ we shall always
mean a test of level of significance $\epsilon$, i.e. satisfying

(9)   $\int_w p_{\theta_0}(e)\,de = \epsilon.$

Let us consider the family of tests

(10)   $w(k): \quad T(e) < k, \quad T(e) > f(k); \qquad k < f(k),$

where $f(k)$ is determined by (9). It easily follows from (9) that $k$ can take on
all values from $-\infty$ to $k_0$, say, where $k_0$ is such that

(11)   $f(k_0) = +\infty.$

For the family F of tests $\{w(k)\}$, $-\infty \le k \le k_0$, we now state

Theorem 1. All members of F are admissible, and if $w$ is any admissible test
not in F, there exists a member of F which has power identical with that of $w$.

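We first prove the

Lemma. Let $\beta_w$ denote the power function of a test $w$. Then if $k_1 < k_2$,

(12)   $\beta_{w(k_1)}(\theta) < \beta_{w(k_2)}(\theta)$ if $\theta < \theta_0$,
       $\beta_{w(k_1)}(\theta) > \beta_{w(k_2)}(\theta)$ if $\theta > \theta_0$.

Proof: Let $\bar w$ denote the complement of a region $w$. Consider the intervals

(13)   $I = w(k_1) \cdot \bar w(k_2),$
       $J = \bar w(k_1) \cdot w(k_2).$

$I$ lies entirely to the right of $J$. Let $\theta > \theta_0$. Then

(14)   $\dfrac{p_\theta(e)}{p_{\theta_0}(e)} = e^{P(\theta) - P(\theta_0)} \exp\{T(e)[Q(\theta) - Q(\theta_0)]\}$

is a strictly increasing function of $T$ since $Q$ is increasing. Therefore there exists a
constant $C$ such that

(15)   $\dfrac{p_\theta(e)}{p_{\theta_0}(e)} < C$ if $T(e)$ is in $J$,
       $\dfrac{p_\theta(e)}{p_{\theta_0}(e)} > C$ if $T(e)$ is in $I$.

Since

(16)   $\int_{w(k_1)} p_{\theta_0}(e)\,de = \int_{w(k_2)} p_{\theta_0}(e)\,de$

we have

(17)   $\int_{T(e) \in I} p_{\theta_0}(e)\,de = \int_{T(e) \in J} p_{\theta_0}(e)\,de$

and therefore

(18)   $\int_{T(e) \in J} p_\theta(e)\,de < C \int_{T(e) \in J} p_{\theta_0}(e)\,de = C \int_{T(e) \in I} p_{\theta_0}(e)\,de < \int_{T(e) \in I} p_\theta(e)\,de$

from which it follows that

(19)   $\int_{w(k_2)} p_\theta(e)\,de < \int_{w(k_1)} p_\theta(e)\,de$

which is the desired result.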
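The lemma can be illustrated in the simplest member of the class (7): a sample of size $n$ from a normal distribution with unit variance and mean $\theta$, testing $\theta_0 = 0$, where $T$ may be taken as the sample mean. The sketch below is an illustration under that assumption only; the function names are hypothetical, and $k = -\infty$ is passed as `-math.inf` to represent the one-sided member $w(-\infty)$.

```python
import math


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def Phi_inv(p):
    """Inverse of Phi by bisection (0 < p < 1)."""
    lo, hi = -40.0, 40.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def upper_limit(k, n, eps):
    """f(k): upper cut-off giving w(k) = {xbar < k or xbar > f(k)} level eps at theta = 0."""
    left = Phi(k * math.sqrt(n))
    if left >= eps:
        raise ValueError("k must be below k_0")
    return Phi_inv(1 - (eps - left)) / math.sqrt(n)


def power(k, theta, n, eps):
    """Power of w(k) when the true mean is theta."""
    s = math.sqrt(n)
    return Phi((k - theta) * s) + 1 - Phi((upper_limit(k, n, eps) - theta) * s)


if __name__ == "__main__":
    for theta in (-0.5, 0.0, 0.5):
        print(theta, [round(power(k, theta, 10, 0.05), 4)
                      for k in (-math.inf, -0.8, -0.6)])
```

For $\theta < 0$ the power increases with $k$, for $\theta > 0$ it decreases with $k$, and at $\theta = 0$ every member has power $\epsilon$, so no member of the family dominates another.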

Proof of Theorem 1. The proof consists of several parts.

I. Let $m$ be any real number, and assume that there exists a value of $k$ such
that

(20)   $\beta_w(\theta_0) = \epsilon$

(21)   $\dfrac{\partial}{\partial\theta} \beta_w(\theta) \Big|_{\theta = \theta_0} = m$

for $w = w(k)$. Then $w(k)$ has power uniformly greater than or equal to that of
any other test satisfying (20) and (21).

For $m = 0$ this becomes Neyman's theorem stating that the type A test is
also of type $A_1$. The proof of the theorem however is independent of the value of

(23)   $\dfrac{\partial}{\partial\theta} \beta_w(\theta) \Big|_{\theta = \theta_0}$

and hence carries over to arbitrary $m$.

II. If there exists any test satisfying (20) and (21) then there exists a number
$k$ for which $w(k)$ also satisfies (20) and (21).

To prove this let us determine, of all tests satisfying (20), the one which
maximizes

(24)   $\int_w \dfrac{\partial}{\partial\theta} p_\theta(e) \Big|_{\theta = \theta_0} de.$

This can be done by means of the lemma of Neyman and Pearson [4, p. 11]
which gives sufficient conditions for a region $w$, subject to restrictions

(25)   $\int_w f_i(e)\,de = a_i, \qquad (i = 1, \cdots, p),$

to maximize an integral

(26)   $\int_w g(e)\,de.$

According to this lemma the desired test is of the form

(25)   $\dfrac{\partial}{\partial\theta} p_\theta(e) \Big|_{\theta = \theta_0} > a \cdot p_{\theta_0}(e)$

provided a value of $a$ exists for which this test satisfies (20). (25) is equivalent to

(26)   $P'(\theta_0) + T(e) \cdot Q'(\theta_0) > a$   from (7)

or, since $Q'(\theta_0) > 0$, to

(27)   $T(e) > b.$

Thus, if a number $b$ exists such that the test (27) satisfies (20), this test is the one
maximizing (24). But such a number does exist, namely $f(-\infty)$. Therefore
$w(-\infty)$ is the desired test.

Similarly it is easy to show that of all tests satisfying (20), $w(k_0)$ minimizes (24).
But

(28)   $\dfrac{\partial}{\partial\theta} \beta_{w(k)}(\theta) \Big|_{\theta = \theta_0} = \int_{T < k,\ T > f(k)} \dfrac{\partial}{\partial\theta} p_\theta(e) \Big|_{\theta = \theta_0} de$

is a continuous function of $k$, and therefore takes on all intermediate values,
which establishes II.

III. From I and II we conclude that given any test $w$ there exists a member
of F which has power uniformly greater than or equal to that of $w$. For let $w$
be any test of $H$. From the condition of regularity it follows that its power
function has a derivative at $\theta_0$. By II there exists a value of $k$ such that the
power function of $w(k)$ has the same slope at $\theta_0$, and from I it follows that
$w(k)$ is uniformly more powerful than $w$.

But from the lemma we see that none of the tests $w(k)$ is uniformly more
powerful than any other. Hence all members of F are admissible, and the
theorem is proved.

From the lemma and Theorem 1 we can conclude for all members of F the
following optimum property:

Corollary 1: Let $w$ be any test, and let $w_0$ be any member of F. Then at least
one of the two statements

(29)   $\beta_w(\theta) \le \beta_{w_0}(\theta)$ for all $\theta < \theta_0$,
       $\beta_w(\theta) \le \beta_{w_0}(\theta)$ for all $\theta > \theta_0$

must hold.

The lemma and Theorem 1 also give the following result concerning most
stringent tests, defined by Wald [1, p. 33].

Corollary 2: There exists a uniformly most powerful of all most stringent tests.
It is that unique member $w_0$ of F for which

$\underset{\theta < \theta_0}{\text{l.u.b.}} \left[ \underset{w}{\text{l.u.b.}}\ \beta_w(\theta) - \beta_{w_0}(\theta) \right] = \underset{\theta > \theta_0}{\text{l.u.b.}} \left[ \underset{w}{\text{l.u.b.}}\ \beta_w(\theta) - \beta_{w_0}(\theta) \right].$

4. The effect on F of assumptions about the alternatives. Let us next consider
how a restriction in the set of alternatives affects the family F. From the lemma
it follows that there is no change as long as the set of alternatives contains
values of $\theta$ both greater and less than $\theta_0$. On the other hand, if the alternatives
are restricted to values of $\theta$ greater than $\theta_0$, say, the family F for testing $H$
against these alternatives consists of only a single member, the test $w(-\infty)$
(and similarly for the other one-sided case). This follows from

Theorem 2: Under conditions a. and b. the test $w(-\infty)$ is uniformly most
powerful against the alternatives $\theta > \theta_0$, the test $w(k_0)$ is uniformly most powerful
against the alternatives $\theta < \theta_0$.

Proof: Let $w$ be any test. By Theorem 1 there exists a number $k$ such that

(30)   $\beta_w(\theta) \le \beta_{w(k)}(\theta)$ for all $\theta$.

From the lemma it follows that

(31)   $\beta_{w(k)}(\theta) \le \beta_{w(k_0)}(\theta)$ if $\theta < \theta_0$,
       $\beta_{w(k)}(\theta) \le \beta_{w(-\infty)}(\theta)$ if $\theta > \theta_0$.

Combining (30) and (31) we have the desired result.

(It is also easy to prove Theorem 2 directly from the Neyman-Pearson lemma.)
In order to illustrate how the assumption of an a priori distribution of $\theta$
together with some information about this distribution affects F, let us consider a
special case of the class of hypotheses discussed so far.

Let

(32)   $p_\theta(x_1, \cdots, x_n) = c \cdot e^{-\frac12 \sum_{i=1}^{n} (x_i - \theta)^2}$

so that $E = (X_1, X_2, \cdots, X_n)$ is a sample from a normal distribution with unit
variance and unknown mean. We want to test the hypothesis

(33)   $H: \theta = 0.$

We shall show that if $\theta$ has a probability density function $g$ which is symmetric



r-AMIUES OP TESTS 


103 


about the origin, then the family F for testing H consists, as might be expected, 
of a single member, the type Ai test. 

Our problem is to find the test w satisfying

(34)    $\int_w p_0(x_1, \cdots, x_n)\,dx_1 \cdots dx_n = \epsilon$

and which maximizes

(35)    $\int_{-\infty}^{\infty} g(\theta) \int_w p_\theta(x_1, \cdots, x_n)\,dx_1 \cdots dx_n\,d\theta.$

Inverting the order of integration, which is permissible in this case, the
Neyman-Pearson lemma shows the desired test to be of the form

(36)    $\int_{-\infty}^{\infty} g(\theta)\,p_\theta(x_1, \cdots, x_n)\,d\theta > a \cdot p_0(x_1, \cdots, x_n)$

provided a value of a exists for which (36) satisfies (34). Substituting from
(32), (36) becomes

(37)    $f(\bar{x}) = \int_{-\infty}^{\infty} g(\theta)\,e^{-\frac{1}{2}n\theta^2 + n\theta\bar{x}}\,d\theta > a$

where

(38)    $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$

Since

(39)    $\frac{d^2}{d\bar{x}^2} f(\bar{x}) > 0$

the region (37) is either empty, which would contradict (34), or else can be
described by inequalities

(40)    $\bar{x} < a_1$,  $\bar{x} > a_2$

where

(41)    $f(a_1) = f(a_2)$

the latter equation becoming, on substitution from (37),

(42)    $\int_{-\infty}^{\infty} g(\theta)\,e^{-\frac{1}{2}n\theta^2}\,(e^{n\theta a_1} - e^{n\theta a_2})\,d\theta = 0.$

If g is an even function, (42) is certainly satisfied when $a_1 = -a_2$. Our test
then becomes

(43)    $\bar{x} < -a_2$,  $\bar{x} > a_2$

which for proper choice of $a_2$ satisfies (34) and is the well known type $A_1$ test.
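
The symmetry argument above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes, purely for illustration, that the a priori density g is the standard normal density (any even density would serve) and that n = 5. The helper names `g` and `f_bar` are hypothetical. The function $f(\bar{x})$ of (37) is evaluated by the trapezoidal rule and tested for evenness and convexity, which together make the region $f(\bar{x}) > a$ a symmetric two-sided region as in (43).

```python
import math

def g(theta):
    """Assumed a priori density of theta: standard normal (an even function)."""
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)

def f_bar(xbar, n, lo=-12.0, hi=12.0, steps=4000):
    """f(xbar) of (37): integral of g(theta) * exp(n*theta*xbar - n*theta^2/2),
    approximated by the trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        theta = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(theta) * math.exp(n * theta * xbar - 0.5 * n * theta * theta)
    return total * h

n = 5  # illustrative sample size
grid = [i / 10.0 for i in range(-20, 21)]
values = [f_bar(x, n) for x in grid]

# Evenness: f(-x) = f(x) because g is even.
is_even = all(abs(f_bar(x, n) - f_bar(-x, n)) <= 1e-9 * f_bar(x, n) for x in grid)

# Convexity, as in (39): all second differences on the grid are positive.
is_convex = all(values[i - 1] - 2.0 * values[i] + values[i + 1] > 0.0
                for i in range(1, len(values) - 1))

print(is_even, is_convex)
```

For this particular choice of g the integral has the closed form $e^{n^2\bar{x}^2/(2(n+1))}/\sqrt{n+1}$, which gives an independent check on the quadrature.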
6. Concluding remarks. Let us consider once more a probability density
function satisfying a and b. We have seen that the family F for testing H
against the alternatives $\theta \ne \theta_0$ contains an infinity of elements unless we make
some additional assumptions. On the other hand, if the principle of unbiasedness
is accepted, F shrinks to a single element: the type $A_1$ test.

But unbiasedness does not insure power. Thus conceivably some other test
might be more powerful than the test chosen, everywhere except in a small
one-sided neighbourhood of $\theta_0$. That this is not so is shown by Corollary 1 to
Theorem 1. This remark illustrates how intuitively appealing principles and a
knowledge of the family F may be used in conjunction to arrive at a choice of
a satisfactory test, when not enough information is available to make the choice
compelling.

Finally, it should be pointed out that although we restricted our considerations
to simple hypotheses, the notions developed also apply to composite hypotheses.
CONDITIONAL EXPECTATION AND UNBIASED SEQUENTIAL
ESTIMATION¹

By David Blackwell
Howard University

1. Summary. It is shown that $E[f(x)E(y \mid x)] = E(fy)$ whenever $E(fy)$
is finite, and that $\sigma^2 E(y \mid x) \le \sigma^2 y$, where $E(y \mid x)$ denotes the conditional
expectation of y with respect to x. These results imply that whenever there is a
sufficient statistic u and an unbiased estimate t, not a function of u only, for a
parameter $\theta$, the function $E(t \mid u)$, which is a function of u only, is an unbiased
estimate for $\theta$ with a variance smaller than that of t. A sequential unbiased
estimate for a parameter is obtained, such that when the sequential test
terminates after i observations, the estimate is a function of a sufficient statistic for the
parameter with respect to these observations. A special case of this estimate is
that obtained by Girshick, Mosteller, and Savage [4] for the parameter of a
binomial distribution.

2. Conditional expectation. Denote by x any (not necessarily numerical)
chance variable and by y any numerical chance variable for which E(y) is finite.
There exists a function of x, the conditional expectation of y with respect to x
[3, pp. 95-101; 5, pp. 41-44], which we denote, as usual, by $E(y \mid x)$ and which is
uniquely defined except for events of zero probability, such that whenever f(x)
is the characteristic function of an event F depending only on x (i.e. f = 1 when
F occurs and f = 0 when F does not occur), the equation

(1)    $E[f(x)E(y \mid x)] = E[f(x)y]$

holds. Now if f(x) is a simple function, i.e. a finite linear combination of
characteristic functions, it is clear from the linearity of expectation that (1) continues
to hold. Quite generally, we shall prove

Theorem 1: The equation (1) holds for every function f(x) for which $E[f(x)y]$
is finite.

To simplify notation, we write $E(z \mid x) = E_x z$ for any chance variable z. The
following corollary to Theorem 1 asserts simply that the operations $E_x$ and
multiplication by f(x) are commutative. This fact, which is trivially equivalent
to Theorem 1, has been stated by Kolmogoroff [5, p. 50].

Corollary: If $E[f(x)y]$ is finite, then $E_x[f(x)y] = f(x)E_x y$.

Proof of Corollary: If g(x) is a characteristic function, then
$E(gfE_x y) = E(gfy)$ by Theorem 1. Since $E_x(fy)$ is unique, the Corollary follows.

Proof of Theorem 1: Since Theorem 1 holds when f(x) is a simple function
and the product of a simple function and a characteristic function is a simple
function, the Corollary holds when f(x) is a simple function.

¹ The author is indebted to M. A. Girshick for suggesting the problem which led to this
paper and for many helpful discussions.

Now let f(x) be any function for which E(fy) is finite. There is a sequence of
simple functions $f_n(x)$ such that $f_n(x) \to f(x)$ and $|f_n(x)| \le |f(x)|$. For instance
we may define $f_n(x) = m/n$ when $m/n \le f(x) < (m + 1)/n$, $0 \le m < n^2$; $f_n(x) = m/n$
when $(m - 1)/n < f(x) \le m/n$, $0 \ge m > -n^2$; $f_n(x) = 0$ otherwise.

We recall the following proposition of Doob [2, p. 206]:

(2)    $|E_x y| \le E_x|y|$

with probability one. Then, using the Corollary (for simple functions) and
(2), we have $|f_n E_x y| = |E_x(f_n y)| \le E_x|f_n y| \le E_x|fy|$, and

(3)    $E(f_n E_x y) = E(f_n y).$

Since the two sequences of functions $f_n E_x y$, $f_n y$ are bounded in absolute value by
the summable functions $E_x|fy|$, $|fy|$, Lebesgue's theorem [8, p. 29] applied
to (3) yields (1).
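
On a finite sample space every function of x is simple, so equation (1) can be verified by direct arithmetic. The sketch below does this for a small joint distribution of (x, y); the distribution, the function `f`, and the helper names are all hypothetical and chosen only for illustration. Exact rational arithmetic is used so that the two sides can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution of (x, y): (x, y) -> probability.
joint = {
    (0, 1): Fraction(1, 8), (0, 3): Fraction(1, 8),
    (1, 2): Fraction(1, 4), (1, 6): Fraction(1, 4),
    (2, 5): Fraction(1, 4),
}

def cond_exp_y_given_x(joint):
    """Return the map x -> E(y | x) for a finite joint distribution."""
    mass, weighted = {}, {}
    for (x, y), p in joint.items():
        mass[x] = mass.get(x, 0) + p
        weighted[x] = weighted.get(x, 0) + p * y
    return {x: weighted[x] / mass[x] for x in mass}

def expectation(joint, func):
    """E[func(x, y)] under the joint distribution."""
    return sum(p * func(x, y) for (x, y), p in joint.items())

def f(x):
    """An arbitrary function of x (not the indicator of an event)."""
    return x * x - 3

e_y_given_x = cond_exp_y_given_x(joint)
lhs = expectation(joint, lambda x, y: f(x) * e_y_given_x[x])  # E[f(x) E(y|x)]
rhs = expectation(joint, lambda x, y: f(x) * y)               # E[f(x) y]
print(lhs, rhs)
```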

In section 3 we shall use the fact that if u is a sufficient statistic for a parameter
$\theta$ and f is any unbiased estimate for $\theta$, then $E(f \mid u)$ (which, since u is a sufficient
statistic, is a function of u independent of $\theta$) is an unbiased estimate for $\theta$. This is
obvious, since it follows from the definition of conditional expectation that the
two chance variables f and $E(f \mid u)$ have the same expected value. The interesting
fact is that the estimate $E(f \mid u)$ is always a better estimate for $\theta$ than f in the
sense of having a smaller variance, unless f is already a function of u only, in
which case the two estimates f and $E(f \mid u)$ clearly coincide. This is simply the
fact that the variance of the regression function of f on u is not greater than the
variance of f. In the case of Gaussian variables, where the regression is linear,
this fact has been noted by Doob [1, p. 231].² Our statement is embodied in

Theorem 2: If $\sigma^2 y$ is finite, so is $\sigma^2 E_x y$, and $\sigma^2 E_x y \le \sigma^2 y$, with equality holding
only if $E_x y = y$ with probability one.

Proof: Denote by m the common expected value of y and $E_x y$. Suppose for
the moment that $\sigma^2 E_x y$ is finite. By the Schwarz inequality $E[yE_x y]$ is then
finite. Then
$\sigma^2 y = E(y - m)^2 = E[(y - E_x y) + (E_x y - m)]^2 = E(y - E_x y)^2 + \sigma^2 E_x y$,
since $E[E_x y(E_x y - m)] = E[y(E_x y - m)]$ by Theorem 1. Thus $\sigma^2 y$
exceeds $\sigma^2 E_x y$ by $E[y - E_x y]^2$, which is positive unless $y = E_x y$, i.e. y is a
function of x. Thus we obtain the usual decomposition: the variance of y is the
variance of the regression of y on x plus the variance of y about the regression of
y on x.
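
The decomposition in the proof of Theorem 2 can be illustrated on a finite joint distribution. The following sketch uses a hypothetical distribution and hypothetical helper names; it computes the three quantities exactly and shows that the variance of y is the sum of the variance of the regression $E_x y$ and the residual term $E(y - E_x y)^2$.

```python
from fractions import Fraction

# Hypothetical joint distribution of (x, y): (x, y) -> probability.
joint = {
    (0, 1): Fraction(1, 8), (0, 3): Fraction(1, 8),
    (1, 2): Fraction(1, 4), (1, 6): Fraction(1, 4),
    (2, 5): Fraction(1, 4),
}

def expectation(func):
    """E[func(x, y)] under the joint distribution above."""
    return sum(p * func(x, y) for (x, y), p in joint.items())

# Regression function E(y | x), built from the marginal of x.
mass, weighted = {}, {}
for (x, y), p in joint.items():
    mass[x] = mass.get(x, 0) + p
    weighted[x] = weighted.get(x, 0) + p * y
reg = {x: weighted[x] / mass[x] for x in mass}

m = expectation(lambda x, y: y)                           # common mean of y and E_x y
var_y = expectation(lambda x, y: (y - m) ** 2)            # variance of y
var_reg = expectation(lambda x, y: (reg[x] - m) ** 2)     # variance of E_x y
residual = expectation(lambda x, y: (y - reg[x]) ** 2)    # E(y - E_x y)^2

print(var_y, var_reg, residual)
```

Equality in Theorem 2 would require the residual term to vanish, which happens only when y is a function of x.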

To show that $\sigma^2 E_x y$ is finite, we require the following

Lemma (Schwarz inequality): If $E(f^2)$ and $E(g^2)$ are finite, then, with
probability one,

$E_x^2(fg) \le E_x(f^2)E_x(g^2).$

A proof can be constructed on the usual lines by considering the function
$Q(x, \lambda) = E_x(f + \lambda g)^2$. There are, however, certain measure-theoretic difficulties
in handling simultaneously the conditional expectations of the family of chance
variables $(f + \lambda g)^2$; instead we shall give a simple direct proof based on the
ordinary Schwarz inequality for integrals.

² For functions of finite variance it is possible to interpret conditional expectation as a
projection in Hilbert space, when the statement becomes simply the Bessel inequality.

We may suppose $f \ge 0$, $g \ge 0$ with probability one, since, from (2),

$E_x^2(fg) \le E_x^2(|f||g|)$

with probability one. Unless the Lemma holds there are three positive numbers
a, b, c with a > bc for which the event

$\{E_x^2(fg) > a,\ E_x(f^2) < b,\ E_x(g^2) < c\} = H$

has positive probability. Then denoting by h the characteristic function of H
and using the Schwarz inequality for integrals, we have

$aP^2(H) \le E^2[hE_x(fg)] = E^2(hfg) \le E(hf^2)E(hg^2) = E[hE_x(f^2)]E[hE_x(g^2)] \le bcP^2(H),$

which is impossible. This completes the proof of the Lemma.

The Lemma, with f = y, g = 1, yields $E_x^2(y) \le E_x(y^2)$ with probability one,
which implies the finiteness of $\sigma^2 E_x y$ and hence completes the proof of Theorem 2.

3. Unbiased sequential estimation. Consider a chance variable z whose
distribution depends on a parameter $\theta$. If we have an unbiased estimate t(z)
and a sufficient statistic u(z) (not necessarily a single numerical chance variable)
for $\theta$, then, as mentioned in section 2, $v(u) = E(t \mid u)$ is an unbiased estimate for $\theta$
depending only on u.³ We have shown that the variance of v is never greater
than that of t, and we shall see that it is sometimes much smaller (see example II
at the end of this section). The estimate obtained in this section for the parameter
of a sequential process is of the v type; its importance lies in the fact that
in many cases there is an unbiased estimate t (generally poor) which is a function
of the first observation, and which will consequently be an unbiased estimate no
matter what sequential test procedure is used.

Let $x_1, x_2, \cdots$ be a sequence of chance variables whose joint distribution is
determined by an unknown point in a parameter space. A sequential sample
(test) [9] is determined by specifying a sequence of mutually exclusive events
$S_1, S_2, \cdots$, where $S_i$ depends only on $x_1, \cdots, x_i$ and

(4)    $\sum_{i=1}^{\infty} P(S_i) = 1$ for all $\theta$.

The event $S_i$ is that sampling stops after the ith observation, and (4) ensures that
sampling stops eventually. Thus if we define the chance variable n = i when $S_i$
occurs, n is the size of the sample.

³ It was pointed out by the referee that, strictly speaking, u does not have to be sufficient;
it is necessary only that v(u) be independent of $\theta$. The author is indebted to the referee for
many valuable suggestions.

Denote by $u_1, u_2, \cdots$ any sequence of chance variables such that
$u_i = u_i(x_1, \cdots, x_i)$ is a sufficient statistic for estimating $\theta$ from $x_1, \cdots, x_i$. There
will of course be many such sequences $\{u_i\}$, but it often happens that there is
one which arises in a natural way from the sequential process; if we are sampling
from a binomial population, for instance, $u_i$ = number of defectives in the first i
observations is a sufficient statistic. We shall suppose that the sequential test
satisfies the following condition

(6)    $S_i = W_i\,C(S_1 + \cdots + S_{i-1})$,⁴

where $W_i$ is an event depending on $u_i$ only. This condition means that when
the ith observation is taken, the decision to stop at this point depends only on
the ith sufficient statistic $u_i$. For the binomial example mentioned above, this
means that the decision to stop after i observations depends only on the number
of defectives observed at that stage, and not on the order in which they were
observed. The Neyman criterion for $u_i$ to be a sufficient statistic [7; 10, p. 136]
shows that (6) is no restriction whatever for the sequential probability ratio
test [9] since the ratio in terms of which the test is defined will be a function of
$u_i$ only.

Let $t_1, t_2, \cdots$ be any sequence of chance variables such that $t_i$ is a function of
$x_1, \cdots, x_i$; define $t = t_i$ when $S_i$ occurs. If $E(t) = \theta$, t is said to be an unbiased
estimate for $\theta$ (relative to the particular sequential test $\{S_i\}$). The theory of
sequential sampling has been formulated primarily for testing hypotheses; a
problem which arises naturally and often is the following: After a sequential
sample has been obtained, is there an unbiased estimate for $\theta$? Since a sample
of constant size is a special case of a sequentially selected sample, we cannot
hope to find unbiased estimates for arbitrary sequential samples unless such
estimates exist for samples of every constant size. This is equivalent to the
existence of a function $t(x_1)$ for which $E(t) = \theta$ for all $\theta$. Our problem is to
discover an unbiased estimate for $\theta$ which, when n = i, is a function of $u_i$ alone.
Such an estimate has been found by Girshick, Mosteller, and Savage [4] for
sequential samples from a binomial population. It turns out that whenever
there is any unbiased estimate at all for a particular sequential test, there is
also one of the type described. Thus, if there is an unbiased estimate t for
samples of fixed size N, there will be an unbiased estimate of the type described
for every sequential test requiring at least N observations, since t is itself an
unbiased estimate for such sequential tests.

Denote by t any unbiased estimate for $\theta$ relative to a particular sequential
test $\{S_i\}$. Denote by $w_i$, $h_i$ the characteristic functions of the events $W_i$,
$C(S_1 + \cdots + S_i)$ respectively, and define

$v = v_i = E(h_{i-1}t_i \mid u_i)/E(h_{i-1} \mid u_i)$

when n = i. To justify the definition of v we remark that the event
$\{n = i,\ E(h_{i-1} \mid u_i) = 0\}$ has probability zero, since $qh_{i-1} \le h_{i-1}$ with probability one,
where q is the characteristic function of the event $\{E(h_{i-1} \mid u_i) > 0\}$, while

$E(qh_{i-1}) = E[qE(h_{i-1} \mid u_i)] = E[E(h_{i-1} \mid u_i)] = E(h_{i-1}).$

⁴ For any event A, C(A) denotes the event that A does not occur.

Since $u_i$ is a sufficient statistic for $\theta$ with respect to $x_1, \cdots, x_i$, v is a function of
u and n only, independent of $\theta$. The main result of this section is
Theorem 3. v is an unbiased estimate for $\theta$.

Proof: We shall show that $v = E(t \mid u, n)$. This not only shows that v is an
unbiased estimate for $\theta$, but also interprets v in a very simple way and, as
mentioned above, implies that the variance of v does not exceed that of t. It must
be verified that for every event D depending only on n and u, E(dv) = E(dt),
where d is the characteristic function of D. Now $D = \sum_{i=1}^{\infty} DS_i$, and $DS_i = D_iS_i$,
where $D_i$ is an event depending only on $u_i$. It is sufficient, then, to show
$E(d_iw_ih_{i-1}v) = E(d_iw_ih_{i-1}t)$, where $d_i$ is the characteristic function of $D_i$. Now

$E(d_iw_ih_{i-1}v) = E[d_iw_ih_{i-1}E(h_{i-1}t_i \mid u_i)/E(h_{i-1} \mid u_i)],$

using the definition of v. The function in brackets is $h_{i-1}$ multiplied by a function
of $u_i$; by Theorem 1 its expectation is unaltered if $h_{i-1}$ is replaced by $E(h_{i-1} \mid u_i)$.
Thus the right member of the last equality equals

$E[d_iw_iE(h_{i-1}t_i \mid u_i)] = E(d_iw_ih_{i-1}t_i) = E(d_iw_ih_{i-1}t).$
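
Theorem 3 can be checked exactly for a small binomial sequential test. The sketch below is an illustration under stated assumptions rather than the author's own computation: the stopping rule (stop as soon as 2 successes or 3 failures have been observed) is hypothetical, as are the helper names. With $t = x_1$ and $u_i = x_1 + \cdots + x_i$, the estimate at a stopping point (i, u) is the proportion of admissible sequences reaching (i, u) that begin with a success, and its expectation is computed by summing over all stopping points.

```python
from fractions import Fraction
from itertools import product

def stops(i, u):
    """Hypothetical stopping rule depending only on (i, u_i):
    stop once 2 successes or 3 failures have been seen."""
    return u >= 2 or (i - u) >= 3

MAX_N = 4  # the rule above always stops by the 4th observation

def continuing(seq):
    """True if sampling has not stopped before the last observation of seq."""
    return all(not stops(j, sum(seq[:j])) for j in range(1, len(seq)))

# k[(i, u)][j]: number of sequences x_1..x_i with n >= i, sum u, and x_1 = j.
k = {}
for i in range(1, MAX_N + 1):
    for seq in product((0, 1), repeat=i):
        if continuing(seq):
            counts = k.setdefault((i, sum(seq)), [0, 0])
            counts[seq[0]] += 1

def v(i, u):
    """Estimate at the stopping point (n, u_n) = (i, u)."""
    k0, k1 = k[(i, u)]
    return Fraction(k1, k0 + k1)

def expected_v(theta):
    """E(v), computed exactly by summing over all stopping points."""
    total = Fraction(0)
    for (i, u), (k0, k1) in k.items():
        if stops(i, u):
            prob = (k0 + k1) * theta**u * (1 - theta)**(i - u)
            total += prob * v(i, u)
    return total

print(expected_v(Fraction(1, 3)))
```

Because the rule depends on the observations only through $(i, u_i)$, it satisfies condition (6), and `expected_v(theta)` returns `theta` for every rational value tried.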

We conclude with two examples:

I. Binomial and Poisson distributions. Suppose $x_1, x_2, \cdots$ are independent
with identical distributions, either binomial or Poisson, with parameter
$\theta$. Then $t = x_1$ ($= t_i$ for all i) is an unbiased estimate for $\theta$, and it is well known
that $u_i = x_1 + \cdots + x_i$ is a sufficient statistic for estimating $\theta$ from $x_1, \cdots, x_i$.
For any sequential test satisfying (6) our unbiased estimate for $\theta$ will be

$v = \frac{E(h_{i-1}x_1 \mid u_i = u)}{E(h_{i-1} \mid u_i = u)} = \frac{E(h_{i-1}x_1f)}{E(h_{i-1}f)}$

when n = i, $u_i = u$, where f is the characteristic function of the event $u_i = u$.
Then

$v = \sum_{j=0}^{\infty} j\,k_j(u, i) \Big/ \sum_{j=0}^{\infty} k_j(u, i)$    for Poisson,

$v = k_1(u, i) \Big/ \sum_{j=0}^{1} k_j(u, i)$    for binomial,

where $k_j(u, i)$ denotes the number of possible sequences $x_1, \cdots, x_i$ for which
$n \ge i$, $x_1 + \cdots + x_i = u$, and $x_1 = j$. For the binomial case, this is the estimate
found in [4].

II. Samples of constant size. We consider the special case where a
sample of constant size N is selected, $x_1, \cdots, x_N$ are independent with identical
distributions, and the density function for $x_i$ has the form

(7)    $p(x, \theta) = r(\theta)s(\theta)^{w(x)}g(x)$

considered by Koopman [6].⁵ Suppose further that there is an unbiased estimate
$t(x_1)$ for $\theta$. These conditions will be satisfied, for instance, if $\theta$ is the mean of a
binomial, Poisson, or normal distribution, with w(x) = t(x) = x. Then
$u_N = w(x_1) + \cdots + w(x_N)$ is a sufficient statistic. Our estimate v becomes simply
$v = E[t(x_1) \mid u_N]$. Now $E[t(x_1) \mid u_N] = \cdots = E[t(x_N) \mid u_N]$, since $u_N$ is a
symmetric function of $x_1, \cdots, x_N$, which are independent with identical
distributions. Consequently

$v = E\left[\sum_{i=1}^{N} t(x_i)/N \,\Big|\, u_N\right]$

so that

$\sigma^2 v \le \sigma^2 t(x_1)/N.$

In the special case w(x) = t(x) = x, we have $v = \sum_{j=1}^{N} x_j/N$, i.e. our estimate is
simply the mean of the N observations $x_1, \cdots, x_N$.

⁵ It has been shown by Koopman [6] that if there is a sufficient statistic satisfying
certain regularity conditions, the density function for x must be of the form (7).
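
Example II can be worked exactly for Bernoulli observations. The sketch below is illustrative only: the function name `rao_blackwell_first_obs`, the sample size N = 4 and the value $\theta = 3/10$ are assumptions. It enumerates all samples, computes $E(x_1 \mid u_N = u)$ for each u, and compares the variance of $t = x_1$ with that of $v = E(x_1 \mid u_N)$.

```python
from fractions import Fraction
from itertools import product

def rao_blackwell_first_obs(N, theta):
    """For N Bernoulli(theta) observations with 0 < theta < 1, return
    (cond, var_t, var_v), where cond[u] = E(x_1 | u_N = u),
    var_t is the variance of t = x_1 and var_v that of v = E(x_1 | u_N)."""
    num = [Fraction(0)] * (N + 1)   # E[x_1 ; u_N = u]
    den = [Fraction(0)] * (N + 1)   # P(u_N = u)
    for seq in product((0, 1), repeat=N):
        u = sum(seq)
        p = theta**u * (1 - theta)**(N - u)
        num[u] += p * seq[0]
        den[u] += p
    cond = [num[u] / den[u] for u in range(N + 1)]
    mean_v = sum(den[u] * cond[u] for u in range(N + 1))
    var_v = sum(den[u] * (cond[u] - mean_v) ** 2 for u in range(N + 1))
    var_t = theta * (1 - theta)
    return cond, var_t, var_v

cond, var_t, var_v = rao_blackwell_first_obs(4, Fraction(3, 10))
print(cond, var_t, var_v)
```

In this case the conditional expectation is the sample mean u/N and the variance is reduced by the factor N, so the bound $\sigma^2 v \le \sigma^2 t(x_1)/N$ is attained.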



THE DISTRIBUTION OF THE MEAN
By E. L. Welker
University of Illinois

1. Summary. Both population and sample mean distributions can be represented
or approximated by Pearson curves if the first four moments of the
population are finite. Using the $\alpha_3^2$, $\delta$ chart of Craig [2] to determine the Pearson
curve type for the population, an analogous $\bar\alpha_3^2$, $\bar\delta$ chart is derived for the
distribution of the mean. This defines a one to one transformation of $\alpha_3^2$, $\delta$ into
$\bar\alpha_3^2$, $\bar\delta$. The properties of this transformation are used to discuss the approach
to normality of the distribution of the mean as dictated by the central limit
theorem. This is facilitated by superposing on the $\alpha_3^2$, $\delta$ chart the $\bar\alpha_3^2$, $\bar\delta$ charts
for samples of 2, 5, and 10.

2. Introduction. For any given distribution function of a population, a
method is available for finding the distribution function of the mean, when it
exists, that depends on characteristic functions and the Fourier integral theorem.
For example, characteristic functions have been used to show that the arithmetic
mean of samples from a normal population is normal, and, with minor restrictions
on non-normal populations, that it is asymptotically normal. The method
depends, of course, on a knowledge of the exact population distribution.
Some authors have discussed the approximation of the distributions of sample
means in special cases by one of the Pearson curves. It is the purpose of this
paper to consider the complete range of Pearson curves as populations to be
sampled, then to give the sampling distributions of the mean as approximated
by the Pearson system, and to discuss the manner in which the distribution of
the mean approaches the normal curve as dictated by the central limit theorem.
Since the choice of a Pearson curve depends only on moment relationships, this
will include the approximation of the distribution of the mean for any parent
population as based on its moments. Both an algebraic and a graphic analysis
will be given.

3. Seminvariant and moment relationships. Denote by $\alpha_k$ the kth order
moment of the population with zero mean and unit variance. Let $\lambda_k$ be the kth
order seminvariant of the population. Let $\bar\alpha_k$ and $\bar\lambda_k$ be the same parameters
of the distribution of $\bar{x}$, the mean of a random sample of size N drawn from this
parent population. Using properties of the seminvariants of linear functions of
variables independent in the probability sense, formulas relating these parameters
[1] are

$\bar\lambda_k = \lambda_k N^{1 - k/2},$

$\bar\alpha_3 = \bar\lambda_3 = \alpha_3 N^{-1/2},$

and

$\bar\alpha_4 = [\alpha_4 + 3(N - 1)]N^{-1}.$

4. The Pearson system of curves and the distribution of the mean. The
determination of the Pearson curve will be made in accordance with the scheme
discussed by C. C. Craig [2]. In this system the curve type is fixed by the
moment $\alpha_3^2$ and the constant

$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3}.$

Fig. 1. The $\alpha_3^2$, $\delta$ Chart for Pearson's Curves

The scheme for determining the type of curve is shown graphically in Fig. 1 in
which the $\alpha_3^2$, $\delta$ plane is divided into areas in which the Pearson curve types are
noted. The bounding $\alpha_3^2$, $\delta$ curves are

$\delta = -1$, $\delta = -\frac{1}{2}$, $\delta = 0$, $\delta = \frac{2}{5}$, $\alpha_3^2 = 0$,

$\alpha_3^2 = 4\delta(\delta + 2)$, and $(2 + 3\delta)\alpha_3^2 = 4(1 + 2\delta)^2(2 + \delta)$.

Let $\bar\delta$ denote the value of the $\delta$ function for the distribution of the mean. Then

$\bar\delta = \frac{2\bar\alpha_4 - 3\bar\alpha_3^2 - 6}{\bar\alpha_4 + 3}.$

In terms of moments of the parent population

$\bar\delta = \frac{2\left[\frac{\alpha_4 + 3(N - 1)}{N}\right] - \frac{3\alpha_3^2}{N} - 6}{\frac{\alpha_4 + 3(N - 1)}{N} + 3} = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3 + 6(N - 1)}.$

We see that $\bar\delta = \delta$ for N = 1, and $|\bar\delta| < |\delta|$ for N > 1 whenever $\delta \ne 0$. Both $\bar\delta$
and $\bar\alpha_3^2$ approach zero as N approaches infinity. These are the values of the
constants for the normal function. This result is expected from the central limit theorem.
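
The passage of $(\bar\alpha_3^2, \bar\delta)$ toward the normal point (0, 0) can be tabulated directly. The sketch below is illustrative: the function names are hypothetical, and the parent population is assumed to be rectangular (uniform), for which $\alpha_3 = 0$ and $\alpha_4 = 9/5$. It computes $\bar\delta$ both from the moments of the mean and from the closed expression in the parent moments, which should agree.

```python
from fractions import Fraction

def delta(alpha3_sq, alpha4):
    """Craig's constant: delta = (2*a4 - 3*a3^2 - 6) / (a4 + 3)."""
    return (2 * alpha4 - 3 * alpha3_sq - 6) / (alpha4 + 3)

def mean_moments(alpha3_sq, alpha4, N):
    """Standardized moments (alpha3^2, alpha4) of the mean of N observations."""
    return alpha3_sq / N, (alpha4 + 3 * (N - 1)) / N

def delta_bar(alpha3_sq, alpha4, N):
    """delta for the distribution of the mean, in terms of the parent moments."""
    return (2 * alpha4 - 3 * alpha3_sq - 6) / (alpha4 + 3 + 6 * (N - 1))

# Assumed parent: uniform distribution, alpha3^2 = 0 and alpha4 = 9/5.
a3sq, a4 = Fraction(0), Fraction(9, 5)
for N in (1, 2, 5, 10, 100):
    b3sq, b4 = mean_moments(a3sq, a4, N)
    print(N, b3sq, delta(b3sq, b4), delta_bar(a3sq, a4, N))
```

For this parent $\delta = -1/2$, and the printed values of $\bar\delta$ shrink in absolute value as N increases.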


5. The $\bar\alpha_3^2$, $\bar\delta$ diagram for varying sample size. For every given population
with finite moments of orders 1 through 4 there exists a Pearson curve representing
or approximating its distribution. This determines a point in the $\alpha_3^2$, $\delta$
plane. For a given sample size, N, there corresponds a point in the $\bar\alpha_3^2$, $\bar\delta$ plane.
If the point ($\bar\alpha_3^2$, $\bar\delta$) is now plotted on the $\alpha_3^2$, $\delta$ plane, we can determine the type
of Pearson curve which is needed to approximate the distribution of $\bar{x}$. The
transformation of $\alpha_3^2$, $\delta$ into $\bar\alpha_3^2$, $\bar\delta$ enables us to analyze the relationship between
population distributions and distributions of $\bar{x}$. The transforms of the boundary
curves in the $\alpha_3^2$, $\delta$ plane will constitute an $\bar\alpha_3^2$, $\bar\delta$ chart corresponding to the one
for $\alpha_3^2$, $\delta$ shown in Fig. 1. In studying the approach to normality of the
distribution of $\bar{x}$, it is illuminating to superimpose this $\bar\alpha_3^2$, $\bar\delta$ chart on the $\alpha_3^2$, $\delta$ chart.
In order to do this, it is necessary to make certain algebraic changes in the
equations.

First eliminate $\alpha_4$ from the formula for $\bar\delta$ as follows. From

$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3}$

we find

$\alpha_4 = \frac{3\delta + 3\alpha_3^2 + 6}{2 - \delta}.$

Substitute this in the expression for $\bar\delta$. Then

$\bar\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3 + 6(N - 1)} = \frac{\delta(\alpha_3^2 + 4)}{\alpha_3^2 + 4 + 2(N - 1)(2 - \delta)}.$

This formula, in conjunction with

$\alpha_3^2 = N\bar\alpha_3^2$

enables us to write the transformations of the boundary curves.

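Boundary Curve                Transformed Curve

$\delta = -1$
    $\bar\delta = \frac{-(N\bar\alpha_3^2 + 4)}{N\bar\alpha_3^2 + 4 + 6(N - 1)}$

$\delta = -\frac{1}{2}$
    $\bar\delta = \frac{-(N\bar\alpha_3^2 + 4)}{2(N\bar\alpha_3^2 + 4) + 10(N - 1)}$

$\delta = 0$
    $\bar\delta = 0$

$\delta = \frac{2}{5}$
    $\bar\delta = \frac{2(N\bar\alpha_3^2 + 4)}{5(N\bar\alpha_3^2 + 4) + 16(N - 1)}$

$\alpha_3^2 = 4\delta(\delta + 2)$
    $\bar\alpha_3^2[N\bar\alpha_3^2 + 4 + 2\bar\delta(N - 1)]^2 = 4\bar\delta(\bar\alpha_3^2 + 4)[\bar\delta(N\bar\alpha_3^2 + 8N - 4) + 2N\bar\alpha_3^2 + 8]$

$(2 + 3\delta)\alpha_3^2 = 4(1 + 2\delta)^2(2 + \delta)$
    $[\bar\delta(16N + 3N\bar\alpha_3^2 - 4) + 2N\bar\alpha_3^2 + 8][N\bar\alpha_3^2 + 4 + 2\bar\delta(N - 1)]^2 N\bar\alpha_3^2$
    $= 4[\bar\delta(2N\bar\alpha_3^2 + 10N - 2) + N\bar\alpha_3^2 + 4]^2[\bar\delta(N\bar\alpha_3^2 + 8N - 4) + 2N\bar\alpha_3^2 + 8]$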
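
The transformed boundary curves can be verified by mapping points of the original curves through the transformation and testing that the images satisfy the transformed equations. The sketch below is a check under stated assumptions, not part of the original paper: the function names are hypothetical, the horizontal boundary is taken to be $\delta = 2/5$, and the curved boundary is taken to be $\alpha_3^2 = 4\delta(\delta + 2)$ with transform $\bar\alpha_3^2[N\bar\alpha_3^2 + 4 + 2\bar\delta(N - 1)]^2 = 4\bar\delta(\bar\alpha_3^2 + 4)[\bar\delta(N\bar\alpha_3^2 + 8N - 4) + 2N\bar\alpha_3^2 + 8]$.

```python
from fractions import Fraction

def transform(alpha3_sq, d, N):
    """Map a parent point (alpha3^2, delta) to the point for the mean of N."""
    b3sq = alpha3_sq / N
    d_bar = d * (alpha3_sq + 4) / (alpha3_sq + 4 + 2 * (N - 1) * (2 - d))
    return b3sq, d_bar

def curve_delta_two_fifths(b3sq, N):
    """Assumed transform of the boundary delta = 2/5, as a function of bar(alpha3)^2."""
    return 2 * (N * b3sq + 4) / (5 * (N * b3sq + 4) + 16 * (N - 1))

def on_transformed_parabola(b3sq, d_bar, N):
    """True if (b3sq, d_bar) satisfies the assumed transform of alpha3^2 = 4*delta*(delta + 2)."""
    B = N * b3sq + 4 + 2 * d_bar * (N - 1)
    lhs = b3sq * B**2
    rhs = 4 * d_bar * (b3sq + 4) * (d_bar * (N * b3sq + 8 * N - 4) + 2 * N * b3sq + 8)
    return lhs == rhs

# Image of a point on alpha3^2 = 4*delta*(delta + 2) with delta = 1/4, for N = 2.
d = Fraction(1, 4)
point = transform(4 * d * (d + 2), d, 2)
print(point, on_transformed_parabola(point[0], point[1], 2))
```

Exact rational arithmetic is used so that the equality tests are not affected by rounding.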


Fig. 2 shows the chart for distributions of $\bar{x}$ for N = 2 by dashed curves
superimposed on the chart for the population shown by the solid curves, and
Fig. 3 consists of the same curves for N = 5 and N = 10. The intervals on the
population values are $0 \le \alpha_3^2 \le 12$ and $-1 \le \delta \le .4$ in Fig. 2, but only part
of the $\alpha_3^2$ range is shown in Fig. 3. In each case the curves for the distribution
of $\bar{x}$ cover the interval for $\bar\alpha_3^2$, $\bar\delta$ which corresponds to the entire interval shown
for the population in Fig. 2. Population curves are identified by capital letters
and the corresponding curves for the distribution of $\bar{x}$ by the corresponding lower
case letters.

Before discussing the Pearson curve relationships disclosed by these graphs,
let us analyze some of the geometric properties of the transformation itself.
Let N be considered as the parameter defining families of curves in the $\bar\alpha_3^2$, $\bar\delta$
plane corresponding to $\alpha_3^2$ = constant and $\delta$ = constant, the systems of lines
parallel to the coordinate axes. The transform of $\alpha_3^2 = k$ is $\bar\alpha_3^2 = k/N$, a system
of lines perpendicular to $\bar\delta = 0$, and approaching $\bar\alpha_3^2 = 0$ with increasing N at
the rate $kN^{-2}$. The line $\alpha_3 = 0$ is invariant under the transformation, but it is
not pointwise invariant.

Fig. 3. The $\alpha_3^2$, $\delta$ and $\bar\alpha_3^2$, $\bar\delta$ Charts (panels for N = 5 and N = 10)


The transform of $\delta = C$ is

$\bar\delta = \frac{C(N\bar\alpha_3^2 + 4)}{N\bar\alpha_3^2 + 4 + 2(N - 1)(2 - C)} = \frac{C(\bar\alpha_3^2 + 4/N)}{\bar\alpha_3^2 + [4 + 2(N - 1)(2 - C)]N^{-1}}.$

Solving for $\bar\alpha_3^2$, this becomes

$\bar\alpha_3^2 = \frac{4C - \bar\delta[4 + 2(N - 1)(2 - C)]}{N(\bar\delta - C)}.$

Except for the straight line $\bar\delta = 0$, obtained when C = 0, this is a system of
rectangular hyperbolas with asymptotes

$\bar\alpha_3^2 = -[4 + 2(N - 1)(2 - C)]N^{-1}$ and $\bar\delta = C$.

We are concerned only with the range $\bar\alpha_3^2 \ge 0$. Hence

$-[4 + 2(N - 1)(2 - C)]N^{-1}$

must be positive for the asymptote to show on the diagram. Since $|\delta| < 2$,
and thus $|C| < 2$, the expression in brackets is necessarily positive. Hence
the vertical asymptote is always outside the range of interest and will not show
on the diagram. However the horizontal asymptotes, $\bar\delta = C$, do appear in all
cases. The hyperbolas are concave downward if C > 0 and are concave upward
if C < 0.

Lines of the pencil $\delta = m\alpha_3^2$ are transformed into the hyperbolas

$\bar\delta = \frac{m\bar\alpha_3^2(N\bar\alpha_3^2 + 4)}{\bar\alpha_3^2 - 2m\bar\alpha_3^2(N - 1) + 4}$

for N > 1. It is clear that (0, 0) is the only invariant point. Every point on
$\delta = m\alpha_3^2$ is transformed into a point closer to the origin, the square of the distance
from the origin changing from

(m° +1)03 to (m* -h l)a3jV’'^ 

It is easily verified that the hyperbolas are asymptotic to

$\bar\delta = \frac{mN\bar\alpha_3^2}{1 - 2m(N - 1)} - \frac{4m(N - 1)(1 + 2m)}{[1 - 2m(N - 1)]^2}$ and $\bar\alpha_3^2 = \frac{-4}{1 - 2m(N - 1)}.$

As N approaches infinity, these asymptotes approach

$\bar\delta = -\frac{\bar\alpha_3^2}{2}$ and $\bar\alpha_3^2 = 0$.

An area in quadrant one (four) in the $\alpha_3^2$, $\delta$ plane is transformed into an area in
quadrant one (four) in the $\bar\alpha_3^2$, $\bar\delta$ plane. The transformed area is nearer the
origin.

6. Types of Pearson curves for distribution of sample means. Examination
of the graphs in conjunction with the above described properties of the
transformation shows the following facts regarding the distribution of means of
samples drawn from populations identified by $\alpha_3^2$ and $\delta$. First consider the
normal function and the three main Pearson types only.

I Parent Population 

Normal 

h 

1 / 

IV 
VIi 

VE 


Dislnbulion of Sample Means 
Normal 
h 

E and E 
It/ , E and I/, 

IV 

VE and IV 
VE , VE and IV. 
The transition types were disregarded completely in the above analysis. It is
worth noting that, disregarding type X, III is transformed into III, VII into 
VII, lie; into lit , never into lit, , V into IV, but never into V. Type X is 
transformed into type III, never into X. Others follow a similar pattern. 

These moment relationships on the distribution of the mean are not sufficient
conditions in general. In special cases they are, for example the normal
distribution and the type III (see [3]). They do represent the best approximation
curve as specified by the Pearson system. We know that in some cases, for
example type II (see [3]), the distribution of means is not described by a Pearson
curve. It is clear, however, that the approach to normality is indicated
analytically by the transformation $\alpha_3^2$, $\delta$ to $\bar\alpha_3^2$, $\bar\delta$ and is shown graphically by the
$\bar\alpha_3^2$, $\bar\delta$ diagram. Skewness and kurtosis in the parent population are reflected
in the distribution of the mean in small samples. A symmetric distribution
of the mean requires a symmetric parent population regardless of sample size,
but the degree of skewness decreases rapidly with an increasing number in the
sample. The Pearson curve which approximates the distribution of $\bar{x}$ from a
bell-shaped parent population is also bell-shaped. The Pearson curve
approximating the distribution of $\bar{x}$ for samples of N = 10 (Fig. 3) is bell-shaped for
any parent population with values of $\alpha_3^2$ and $\delta$ within the intervals considered.
For samples of 5 in the same range the approximating curve is either bell-shaped
or J-shaped, but it is never U-shaped. For samples of 2, even the U-shaped
distribution is possible, but only with extreme values of $\alpha_3^2$ and $\delta$. The point in
the $\alpha_3^2$, $\delta$ plane corresponding to the normal curve is the only invariant point in
the transformation. Hence parent populations with parameters not satisfying
$\alpha_3 = \delta = 0$ cannot yield normal distributions of sample means.

NOTES

This section is devoted to brief research and expository articles on methodology
and other short items.

ON THE STUDENTIZATION OF SEVERAL VARIANCES

By B. L. Welch
University of Leeds, England

1. Introduction. In a recent paper [1] the author considered the problem
of eliminating several variances simultaneously from probability statements
concerning the mean of a normally distributed variable. The general situation
envisaged was as follows. We supposed that we had an observed quantity y
which could be assumed to be normally distributed about a population mean
$\eta$ with variance $\sigma_y^2 = \sum_{i=1}^{k} \lambda_i\sigma_i^2$, where the $\lambda_i$ are known positive numbers and the
$\sigma_i^2$ unknown population variances. It was supposed further that the data
provided estimates $s_i^2$ of the $\sigma_i^2$ based on $f_i$ degrees of freedom, and having the
sampling distributions

(1)    $p(s_i^2)\,ds_i^2 = \frac{1}{\Gamma(\frac{1}{2}f_i)} \left(\frac{f_is_i^2}{2\sigma_i^2}\right)^{\frac{1}{2}f_i - 1} \exp\left\{-\frac{f_is_i^2}{2\sigma_i^2}\right\} d\left(\frac{f_is_i^2}{2\sigma_i^2}\right)$

and that these estimates were distributed independently of each other and of y.
The problem was to make statements about the magnitude of the difference
$(y - \eta)$ which would involve explicitly only the observed variances $s_i^2$. The
probability of the truth of the statements was also to be entirely independent
of the population values $\sigma_i^2$.

The solution was given implicitly in a formal mathematical expression and a
general process of developing successive terms in a series expansion was
described. In the present communication a slightly different way of reaching this
development is provided.

2. General method. If the $f_i$ are large enough the ratio

(2)    $v = \frac{y - \eta}{\sqrt{\sum\lambda_is_i^2}}$

can be taken to be normally distributed with mean zero and standard deviation
unity. This suggests that, when the $f_i$ are not necessarily large, we might
approach the matter by seeking some other function

(3)    $x = g[s_1^2, s_2^2, \cdots, s_k^2, y - \eta]$

which will still be normally distributed with the same mean and standard
deviation. We shall see that such a function can be found, although the method
to be followed leads us first to another expression

(4)    $y - \eta = h(s_1^2, s_2^2, \cdots, s_k^2, x)$

which is simply the transposed form of (3). Once we have obtained h we can
solve out from (4) to obtain g.

Since the distribution of y is independent of $s_i^2$ we have

(5)    $p(y \mid s^2)\,dy = \frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\frac{(y - \eta)^2}{\sum\lambda_i\sigma_i^2}\right\} dy.$

Transforming therefore to the new variable x we have for given $s_i^2$

(6)    $p(x \mid s^2)\,dx = \frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\frac{h^2(s^2, x)}{\sum\lambda_i\sigma_i^2}\right\} \frac{\partial h(s^2, x)}{\partial x}\,dx = j\{s^2, x, \sum\lambda_i\sigma_i^2\}\,dx$ (say).

The unrestricted distribution of x is then obtained by averaging over the joint
distribution of the $s_i^2$. In order that x should be a unit normal deviate we must
therefore have

(7)    $p(x) = \int \cdots \int j\{s^2, x, \sum\lambda_i\sigma_i^2\} \prod_i \{p(s_i^2)\,ds_i^2\} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}.$

We have to substitute from (1) and (6) into (7) and then choose the function
$h(s^2, x)$ in such a manner that the equation is satisfied whatever may be the
values of the unknown $\sigma_i^2$. To evaluate the function by the methods of numerical
integration is probably impracticable except perhaps in some simple special
cases. A series development is, however, quite feasible.
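Symbolically we can write

(8)    $j\{s^2, x, \sum\lambda_i\sigma_i^2\} = \exp\left\{\sum_i (s_i^2 - \sigma_i^2)\partial_i\right\} j\{w, x, \sum\lambda_i\sigma_i^2\}$

where $\partial_i$ denotes differentiation with respect to $w_i$ and subsequent equation to
$\sigma_i^2$. Equation (7) then integrates out to give

(9)    $\prod_i \left[e^{-\sigma_i^2\partial_i}\left(1 - \frac{2\sigma_i^2\partial_i}{f_i}\right)^{-\frac{1}{2}f_i}\right] j\{w, x, \sum\lambda_i\sigma_i^2\} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}$

i.e.

(10)    $\Theta\,j\{w, x, \sum\lambda_i\sigma_i^2\} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}$ (say).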
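
The way each factor of the operator in (9) develops in powers of $1/f_i$ can be examined by treating $\sigma_i^2\partial_i$ formally as an ordinary number a. The sketch below rests on that assumption and on the elementary series for the logarithm; the function names are hypothetical. It compares $\log[e^{-a}(1 - 2a/f)^{-f/2}]$ with the leading terms $a^2/f + \frac{4}{3}a^3/f^2 + 2a^4/f^3$, whose general term is $2^{n-1}a^n/(nf^{n-1})$ for $n \ge 2$.

```python
import math

def log_theta_factor(a, f):
    """log of exp(-a) * (1 - 2a/f)^(-f/2), with a standing in for sigma_i^2 * d_i."""
    return -a - 0.5 * f * math.log(1.0 - 2.0 * a / f)

def series(a, f, terms=3):
    """Leading terms a^2/f + (4/3) a^3/f^2 + 2 a^4/f^3 of the same logarithm."""
    coeffs = [1.0, 4.0 / 3.0, 2.0]
    return sum(c * a ** (k + 2) / f ** (k + 1) for k, c in enumerate(coeffs[:terms]))

a, f = 0.1, 10.0  # illustrative values only
print(log_theta_factor(a, f), series(a, f, terms=1), series(a, f))
```

The first term, $a^2/f$, is the one that matters for results correct to order $1/f_i$.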


The operator © must bo expanded in powers of di before it can bo interpreted 
When this is done we find 
Our procedure now is to find successive approximations to $h(s^2, x)$. It will
be convenient to denote by $h_r(s^2, x)$ an expression which equals $h(s^2, x)$ to terms
of order $1/f_i^r$. Further let $c_{r+1}(s^2, x)$ be a corrective term which when added on
to $h_r(s^2, x)$ will give a result correct to terms in $1/f_i^{r+1}$. Then to this order we
shall have from (6)


03) 


V2^j{w, X, oxp|-i 


, hl(w, a;)\ dhriw, x) 

J da: 


+ 


'n/sX.o'. 


exp-^ 


a:“(2X{W»)\ jacr+i(w, x) x(X\iW,)Cr+liVl, x) 




dx 


SX.ir? 


remembering that the leading term in $h(w, x)$ is $x\sqrt{\Sigma\lambda_i w_i}$.
Hence from (10) we find


(14) 


© 


1 

Vsxx 


exp 


1 hl{w, a))\ dhr(w, x) 

2Xi<r^, / dx 
r _^L_ „-lx’/dCr4-l((r®, ®) 

+ vij:x“ — 


:i:cr4-i(o' } x') I 


i.e. 


(15) 


f -j,J Cr+l(vS X) \ 

da I VsXiff? / 


+ O exp 


f ,h l(w, a;) \ 

I J 


1_ dhr(tU, X) 




Given $h_r$ we can therefore proceed directly to $c_{r+1}$ and hence to $h_{r+1}$.

3. Application to give terms in $1/f_i$. It will be sufficient illustration of the
method, if we show here how to obtain $h_1$ from $h_0$. We have from (15)


(16) 

i.e. 

(17) 


A Cl(o^^3^) \ 

d4 V^'/ 



.r' {2Kw,) \ / (SX.tc.O ^ -1,. 
2 (SX,v?)/ r (2X.v“<) 


^ / -jiS Cl(<T^a;) 

dx\ VsX^i 




(sxkV/O 


£ exp 



a/u = 0 


where $\partial$ now denotes differentiation with respect to $u$ and subsequent equation
to unity,


i.e.

(18) 

(19) 

whence 


d Ci(cr®, a:) \ _ (2X«r<//j) 1 ..i o™* -,■<1 

dT^r ^ 

— ^ (2XivV/t) L I- 3.^1 


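(20) $c_1(\sigma^2, x) = x\sqrt{\Sigma\lambda_i\sigma_i^2}\;\dfrac{(1 + x^2)}{4}\;\dfrac{\Sigma(\lambda_i^2\sigma_i^4/f_i)}{(\Sigma\lambda_i\sigma_i^2)^2}.$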
Hence to the terms in $1/f_i$ we have

(21) $y - \eta = h_1(s^2, x) = x\sqrt{\Sigma\lambda_i s_i^2}\left[1 + \dfrac{(1 + x^2)}{4}\;\dfrac{\Sigma(\lambda_i^2 s_i^4/f_i)}{(\Sigma\lambda_i s_i^2)^2}\right].$


Solving this out for $x$ we obtain, to the same order,

(22) $x = v\left[1 - \dfrac{(1 + v^2)}{4}\;\dfrac{\Sigma(\lambda_i^2 s_i^4/f_i)}{(\Sigma\lambda_i s_i^2)^2}\right]$

where $v$ equals $(y - \eta)/\sqrt{\Sigma\lambda_i s_i^2}$. To order $1/f_i$ we may regard $x$ as a unit normal
deviate and hence determine the probability level corresponding to the observed
ratio $v$. On the other hand if we wish to determine the value of $y - \eta$ which
will lie on a given percentage level the expression (21) is the appropriate one
to use.
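
The two uses of the first-order result can be sketched numerically. The following is a minimal illustration only, assuming the forms of (21) and (22) given in this section; the function names and the argument layout (parallel lists of $s_i^2$, $\lambda_i$ and $f_i$) are hypothetical and not part of the original note.

```python
import math

def _scale_and_correction(s2, lam, f):
    """Return sum(lam_i * s_i^2) and the first-order correction ratio
    sum(lam_i^2 s_i^4 / f_i) / (sum(lam_i s_i^2))^2."""
    total = sum(l * s for l, s in zip(lam, s2))
    ratio = sum((l * s) ** 2 / fi for l, s, fi in zip(lam, s2, f)) / total ** 2
    return total, ratio

def deviate_from_ratio(y_minus_eta, s2, lam, f):
    """Form (22): map the observed ratio v to an approximate unit normal deviate x."""
    total, ratio = _scale_and_correction(s2, lam, f)
    v = y_minus_eta / math.sqrt(total)
    return v * (1.0 - (1.0 + v * v) / 4.0 * ratio)

def difference_at_level(x, s2, lam, f):
    """Form (21): value of y - eta lying on the level of the normal deviate x."""
    total, ratio = _scale_and_correction(s2, lam, f)
    return x * math.sqrt(total) * (1.0 + (1.0 + x * x) / 4.0 * ratio)
```

Since both expressions are truncated at terms in $1/f_i$, applying one after the other returns the starting deviate only up to terms of the next order; the discrepancy shrinks as the $f_i$ grow.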

4. Further discussion. The present development is of course basically
equivalent to that given in the previous paper. Indeed if we integrate (10) or
(15) out with respect to $x$ we arrive immediately at the formulae which were then
obtained and which were illustrated by calculating terms to order $1/f_i^2$. In
fact when calculating higher order terms it seems best to do this integration
before carrying out the operation $\Theta$. The object of the present note is really to
stress the fact that we are simply finding a function of the observations and of
$y - \eta$ which is distributed as a unit normal deviate, whatever the values the
true $\sigma_i^2$ may chance to possess.

Finally, the remarks following equation (7) above should be somewhat amplified.
The equation asserts that the distribution of any arbitrary function $x$,
defined by (3), is


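(23) $p(x) = \int\cdots\int \dfrac{1}{\sqrt{2\pi\,\Sigma\lambda_i\sigma_i^2}} \exp\left\{-\dfrac{1}{2}\,\dfrac{h^2(s^2, x)}{\Sigma\lambda_i\sigma_i^2}\right\} \dfrac{\partial h(s^2, x)}{\partial x} \prod_i \{p(s_i^2)\,ds_i^2\},$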


where $h(s^2, x)$ is the function obtained by solving out (3) for $y - \eta$. On carrying
out the integrations in (23) we shall in general obtain $p(x)$ as a function of $x$ and
$\sigma_i^2$. Our argument is that if $h$ be chosen properly the $\sigma_i^2$ will disappear from
$p(x)$, and $x$ will appear only in the form of the unit normal probability function.

To find $h(s^2, x)$ by a direct process of numerical integration would appear to
involve in the first instance the choice of a network of points for $x$ and $s_i^2$.
Suppose the range of $x$ is covered by $n_x$ points and the range of $s_i^2$ by $n_i$ points.
We may then as an approximation look on our task as that of finding the $(n_x \Pi n_i)$
values of $h(s^2, x)$ corresponding to this network. Since (23) is to be true for all $x$
and $\sigma_i^2$, we can take in turn $n_i$ values of $\sigma_i^2$, and then (23) can be replaced by
$(n_x \Pi n_i)$ simultaneous equations (it would be necessary to use some formula
expressing $\partial h(s^2, x)/\partial x$ in terms of values of $h(s^2, x)$ at discrete values of $x$, or
conceivably this may be avoided if we work with the integrated form). With
a proper choice of the points for $x$, $s_i^2$, and $\sigma_i^2$, we might expect to evaluate the
series $h(s^2, x)$ to any required degree of accuracy, but clearly as a general process
to be used over a whole range of values this approach would be too laborious.
It may indeed be queried whether theoretically, with an indefinitely fine
network of points, we shall be led to a unique function $h(s^2, x)$ with the common
sense properties, which, from general statistical considerations, we know it
should have in order to be acceptable. As with integral equations of a simpler
character, the passage from a discrete network to a continuum may raise problems,
but it is the author's opinion that the infinite ranges of $x$ and $s_i^2$ give us the
freedom which we require in the solution.

The author, however, prefers to approach the problem from the numerical
behavior of the series, of which (15) gives the general terms. Here the practical
issue appears to be to investigate the relation between the magnitude of the last
term retained and the $f_i$. The author hopes in a further paper to give some
results of an investigation of this character and also some tables facilitating the
calculation of $h(s^2, x)$.
PROBABILITY SCHEMES WITH CONTAGION IN SPACE AND TIME¹
By Félix Cernuschi* and Louis Castagnetto
Harvard University

1. Summary. In many natural assemblies of elements, the probability of
an event for a given element depends not only on the intrinsic nature of that
particular element, but also on the states of some or all of the rest of the elements
belonging to the same assembly. On the basis of this general idea of “contagion”
some urn schemes are developed in this paper in which one has contagious
influence in space and time. The most interesting result found is that in general 
the points of convergence of the probability of the assembly are given by some 
of the roots of an equation p = /(p) and that some of these roots, between zero 
and one, represent stable states of the assembly, or points of convergence, and 
others represent unstable ones, or points of divergence. The two neighboring 
roots, (if they are single), of a root representing a point of convergence are un- 
stable values of the probability. Consequently, under certain conditions, the 
limiting probability may be made to have a finite jump by changing the initial 
probability by an arbitrarily small amount. The concrete cases developed in 
this paper can be considerably extended by similar methods by assuming more 
complicated and general assemblies and laws of contagion. 

¹ On the suggestion of the referee, some parts of the original paper were deleted and
some mathematical simplifications were introduced.

* Research Associate at Harvard Astronomical Observatory and Guggenheim Fellow. 
2. Introduction. In the known probability schemes of contagion of Eggenberger
and Polya [1], Greenwood and Yule [2], Luders [3], Neyman [4], Feller [5]
and others [6], as well as in Markoff chains, different ways are considered in
which the previous results in a definite series of trials may influence the
probabilities of the future ones. All of these schemes consider possible influences of
the results of the different trials along the time axis, and consequently might
be called schemes of contagion in one dimension and one direction.

In many natural assemblies of individuals or elements, the probability of an 
event per individual or element depends not only on the intrinsic nature of the 
considered element but also on the states of the rest of the elements belonging
to the same assembly. 

The purpose of this paper is to develop some simple schemes with urns in 
which there is a contagious influence in space and time and to show some of their 
consequences. The method which we have used to treat certain concrete cases 
could be applied to more complicated assemblies and laws of influence in space 
and time. 


3. Scheme of a closed assembly of urns in two dimensions. Let us consider
a set of $N$ urns arranged on a closed surface in such a way that each one of them
is surrounded by $m$ others. Let each urn contain a finite number of black and
white balls. In this paper the probability associated with an urn will refer to
the probability of obtaining a white ball if a single ball is drawn at random from
the urn. We shall assume that the initial probabilities are equal for all of the
urns and that the following law of influence holds: When, after a collective
trial, one finds that the ball drawn from a certain arbitrary urn, taken as the
central one, is white and that the corresponding results of the $m$ surrounding
urns give $l$ white and $s$ black balls, one multiplies the probability of obtaining a
white ball out of the central urn by the factor $\alpha_{1,1}^{l}\alpha_{1,2}^{s}$; if the ball drawn from the
central urn were black, without changing the given results of the surrounding
urns, one multiplies the considered probability by the factor $\alpha_{2,1}^{l}\alpha_{2,2}^{s}$. Under
the specified conditions, it is easily seen that the probability of obtaining a white
ball from a definite urn at the $i + 1$ trial will be, by considering all the possible
alternatives:

(1) $p_{i+1} = \sum_{l=0}^{m} \dfrac{m!}{l!\,(m-l)!}\left[p_i^2 (p_i\alpha_{1,1})^{l}(q_i\alpha_{1,2})^{m-l} + p_i q_i (p_i\alpha_{2,1})^{l}(q_i\alpha_{2,2})^{m-l}\right]$
$\qquad = f(p_i) = p_i^2 (p_i\alpha_{1,1} + q_i\alpha_{1,2})^m + p_i q_i (p_i\alpha_{2,1} + q_i\alpha_{2,2})^m,$

where:

$p_i + q_i = 1.$

Consequently $p_i$ either converges to a root of the equation $p = f(p)$ or tends to
infinity. As a probability greater than one or smaller than zero has no meaning,
we have to study the function $y = f(p)$ between zero and one. In (1) we have
given an implicit form for $y = f(p)$, corresponding to a particular case of influence;
by changing the law of influence we change the function $f(p)$. In general
one can find graphically the roots of equation $p = f(p)$ by plotting $y = f(p)$ and
$y = p$ and by determining the intersections of these two lines in the range
$0 \le p \le 1$. Later we shall give the values of these roots for some concrete
examples. From what we have shown it follows that if, for the considered
assembly of urns and for especially chosen values of the parameters of
interconnection and initial probabilities, the probability tends to some equilibrium
value, this must be a root of the equation $p = f(p)$. As we shall see later, the
roots in the range $0 \le p \le 1$ may represent stable or unstable states of the
assembly.
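
The graphical search for intersections of $y = f(p)$ and $y = p$ can be imitated numerically. The sketch below is only an illustration: it assumes the closed form $f(p) = p^2(p\alpha_{1,1} + q\alpha_{1,2})^m + pq(p\alpha_{2,1} + q\alpha_{2,2})^m$ for scheme (1), and the function names, the grid size and the example parameters in the usage note are hypothetical choices, not values taken from the paper.

```python
def f_scheme1(p, m, a11, a12, a21, a22):
    """Closed form assumed for recursion (1), with q = 1 - p."""
    q = 1.0 - p
    return p * p * (p * a11 + q * a12) ** m + p * q * (p * a21 + q * a22) ** m

def fixed_points(g, steps=2000, tol=1e-12):
    """Locate roots of p = g(p) in [0, 1] by a sign-change scan plus bisection."""
    roots = []
    grid = [k / steps for k in range(steps + 1)]
    for lo, hi in zip(grid, grid[1:]):
        d_lo, d_hi = g(lo) - lo, g(hi) - hi
        if d_lo == 0.0:
            roots.append(lo)          # a grid point that is itself a root
        elif d_lo * d_hi < 0.0:
            a, b = lo, hi             # g(p) - p changes sign inside (lo, hi)
            while b - a > tol:
                mid = 0.5 * (a + b)
                if (g(a) - a) * (g(mid) - mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
    if g(1.0) == 1.0:
        roots.append(1.0)             # the right end point is never a "lo"
    return roots
```

For instance, `fixed_points(lambda p: f_scheme1(p, 2, 0.9, 1.3, 1.3, 0.9))` scans a ring-like case with illustrative factors; a scan of this kind can miss roots of even multiplicity, where the curve touches the line without crossing it.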

Let us consider now a general method for finding the explicit form of the
function $f(p)$ corresponding to laws of influence similar to the one used by Polya.

Assume that the trial $i$ results in the drawing of $l$ white balls and $s$ black balls
from the $m$ urns surrounding the central one. Then we add $l w_1$ white and
$s b_1$ black balls to the central urn if the result of the central urn was white, and
$l w_2$ white and $s b_2$ black balls if it was black. It is easy to show that under these
conditions the probability in the trial $i + 1$ is related to the probability in
trial $i$ by the following formula:


( 2 ) 


p,+i = pj +gAT 

Jo L 


dt 


+ (1 - p.) 

Ja 




1 dt 
Ju-la-l 


where $W_i$ and $N_i$ are the number of white balls and the total number of balls,
respectively, in the central urn before trial $i$. Relation (2) permits us to study
several interesting schemes. It is easy to see that all the possible schemes which
can be represented by relations of type (2) give only values of the probability
in the interval zero and one; and consequently we do not need to make the
restriction in the analysis of the equation $p = f(p)$ that was necessary in the
previous scheme, represented by equation (1).

For the case $w_1 = b_1 = c_1$, $w_2 = b_2 = c_2$, we obtain from (2)


(3) 


Pi+l = Pv 


Wx -t- mp, C l 
A, -f mct 


+ (I — Pi) - 


+ mci 


If $c_1 = c_2$, (3) gives

(4) $p_{i+1} = p_i$.

If one takes $c_1 = k_1 N_i$ and $c_2 = k_2 N_i$, (3) becomes


(5) 


Pi+i = Pi 


p, + mkip 
] + mfci 


- + (1 - Pi) 


Pi + mkip , 

1 + mki 


fiPi) 


and the equation $p = f(p)$ has, in this case, the roots 0 and 1.
When $w_1 = b_2 = k_1 N_i$ and $b_1 = w_2 = k_2 N_i$, one has to replace $t_1(d/dt_1)$ by
$t_2(d/dt_2)$ in the second term of (2); then if we take $m = 2$,


(6) Pv+i = 


Vi + 2fci 

1 4- 2/cj 


+ PWi 


(r 


Pi + 


+ 21c2 


I -f- /ci 4" lC2 


- 3 


4- 2fci 

4" 2hi 






In particular, if $k_1 = k_2 = k$, one obtains


(7) p'+i = ~ + 2/c] = /(p.). 

and the solutions of the equation $p = f(p)$ are $p = \tfrac{1}{2}$ and 1. By considering
the behavior of $p = f(p)$ one finds that the stable solution is given by the root $\tfrac{1}{2}$;
consequently if one starts with any value of $0 < p < 1$ the probability tends
to the limiting value $\tfrac{1}{2}$. If $k_1 = 0$, $k_2 \neq 0$, by simple calculations, one obtains
from (6) that the solutions of $p = f(p)$, in this case, are zero and one.

The equation $p = f(p)$, as given by (6), always has the solution 1. In order
to have the other two roots real, one has to satisfy:

^ci(l 4" 2 /cj) (2 d" fci 4" 3 /C 2 )* ^ 4(1 4" hi 4- fe) 

[{hi 4" fe)'* "h 2(fci — ki) — 4 /cj). 


A simple and interesting application of relation (2) is for the case of two urns,
characterized by $m = 1$. From (2) we obtain:


(9) Pi+i 

where 




+ fcl. ^ 1 - 


4* fca 14" hi 




/(pi) 


$w_1 = k_1 N_i$, $b_1 = k_2 N_i$, $w_2 = k_3 N_i$, $b_2 = k_4 N_i$.

The equation $p = f(p)$, as given by (9), has the roots 0 and 1; and one may fix
the value of the third root by conveniently choosing the values of the parameters.

Applying (2) for an arbitrary value of $m$ and integrating by parts, it is seen
that in general the equation $p = f(p)$ is of degree $m + 2$ and consequently, by
choosing appropriate values for the parameters $k_1, k_2, k_3, k_4$, each of which may
be between $-1$ and $\infty$, one can expect several roots in the range $0 < p < 1$.
One can easily generalize our relation (2) for cases in which $w_1, w_2, b_1, b_2$ are
given functions of the probability $p_i$. Even in this most general case it is simple
to see that one would have a recursion formula of the type $p_{i+1} = f(p_i)$ and, as
in the elementary cases which we have considered, the points of equilibrium of
the closed assembly of urns will be given by those solutions, in the range
$0 < p < 1$, of the equation $p = f(p)$, where the derivative of $y = f(p)$ is negative.
Consequently the two neighboring roots, if they are single, of a root representing a
point of convergence are unstable values of the probability. Therefore, under
certain conditions, the limiting probability may take a finite jump if the initial
probability is changed by an arbitrarily small amount. This is, we think, the
most important consequence of the contagion schemes that we propose. We
consider that many actual cases of contagion could be better understood by
schemes of the type that we are studying.

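Let us consider now some simple cases of relation (1). If we take

$\alpha_{1,1} = \alpha_{2,2} = \alpha_1$, $\alpha_{1,2} = \alpha_{2,1} = \alpha_2$ and $m = 2$,

representing a closed ring of urns, one obtains:

(10) $p_{i+1} = p_i^2(\alpha_1 p_i + \alpha_2 q_i)^2 + p_i q_i(\alpha_2 p_i + \alpha_1 q_i)^2$
$\qquad = \alpha_1^2 p_i + (p_i^2 - p_i^3)\,[(\alpha_1 + \alpha_2)^2 - 4\alpha_1^2] = f(p_i).$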


The equation $p = f(p)$, corresponding to this recursion formula, always has the
solution $p = 0$. The other two solutions are given by

(11) $P_{1,2} = \dfrac{1}{2}\left[1 \pm \sqrt{\dfrac{4 - (\alpha_1 + \alpha_2)^2}{4\alpha_1^2 - (\alpha_1 + \alpha_2)^2}}\;\right].$


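These roots will be between 0 and 1 when

(12) $2 < \alpha_1 + \alpha_2$, $1 > \alpha_1 < \alpha_2$ or (12') $2 > \alpha_1 + \alpha_2$, $1 < \alpha_1 > \alpha_2$.

We would have $P_1 > 0$ and $P_2 < 0$ if

(13) $2 < \alpha_1 + \alpha_2$, $1 < \alpha_1 < \alpha_2$ or (13') $2 > \alpha_1 + \alpha_2$, $1 > \alpha_1 > \alpha_2$,

and $P_1 = P_2$ when

(14) $\alpha_1 + \alpha_2 = 2$, $\alpha_1 \neq 1$.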

Let us now study the general behavior of (10). For the conditions (12') we
have:

(15) $p_{i+1} - p_i = a' p_i (p_i - P_1)(p_i - P_2)$

where $a' = 4\alpha_1^2 - (\alpha_1 + \alpha_2)^2 > 0$.

If $0 < P_1 < P_2$, one obtains from (15) by use of elementary algebra:


(16) 


p(+i “ Pi 


Pi - pi 


aV. 1 {Pi - Pi) \< 


a'Pl 


I 


Consequently if $p_i > P_2$ the sequence $p_i$ increases monotonically. Otherwise
$p_{i+1}$ will lie between $P_1$ and $p_i$ and will tend to $P_1$ without ever reaching the other
side of this point. In a similar way it is possible to prove the convergence to a
constant for the most general equations of the type $p = f(p)$ when they have
roots between zero and one.

Let us give some numerical results. For $\alpha_1 = 0.96$ and $\alpha_2 = 1.1$, from (10)
one obtains: $P_1 = 0.1$ and $P_2 = 0.9$. It is easily seen that, in this case, if
$0 < p_1 < 0.1$, the limiting value of $p_i$ will be zero; if $p_1 > 0.1$, the limiting value
will be 0.9. The interesting point is that if the initial probability is in the
neighborhood of 0.1, an infinitesimal change in its value may produce a finite
change in the stable limiting probabilities; and that for the initial probability
equal to 0.1 one would have an unstable equilibrium of the system. This
consideration shows why it is important to know how the probability $p_i$ converges
towards a certain point. As we have previously shown, the points of
convergence are roots of the equation $p = f(p)$, but there are roots which are not points of
convergence.

Similar reasoning could be applied to more complicated systems belonging to
our general scheme of contagion. Consequently, the most important result is
not that the considered assembly may have a probability tending to some value
in the range $0 < p < 1$, but that under certain conditions the limiting probability
may jump from one value to another by changing the initial probability by an
arbitrarily small amount.
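
The jump in the limiting probability can be seen by iterating the ring recursion directly. The sketch below assumes the cubic form of (10), $f(p) = \alpha_1^2 p + (p^2 - p^3)[(\alpha_1 + \alpha_2)^2 - 4\alpha_1^2]$, and the root formula (11); the function names are hypothetical, and the parameters in the usage note ($\alpha_1 = 0.9$, $\alpha_2 = 1.3$) are chosen here purely for illustration and are not the paper's numerical example.

```python
def f_ring(p, a1, a2):
    """Cubic form assumed for recursion (10) on a closed ring of urns (m = 2)."""
    return a1 * a1 * p + (p * p - p ** 3) * ((a1 + a2) ** 2 - 4.0 * a1 * a1)

def interior_roots(a1, a2):
    """The two non-zero roots of p = f(p), from formula (11); needs a real square root."""
    s = (a1 + a2) ** 2
    r = ((4.0 - s) / (4.0 * a1 * a1 - s)) ** 0.5
    return 0.5 * (1.0 - r), 0.5 * (1.0 + r)

def limiting_probability(p0, a1, a2, n_trials=5000):
    """Iterate p_{i+1} = f(p_i) a fixed number of times, starting from p0."""
    p = p0
    for _ in range(n_trials):
        p = f_ring(p, a1, a2)
    return p
```

With these illustrative parameters the smaller interior root is the unstable one: starting `limiting_probability` slightly below it drives the probability to zero, while starting slightly above it drives the probability to the larger root.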
FITTING CURVES WITH ZERO OR INFINITE END POINTS

By Edmund Pinney
Oregon State College

The problem of determining a suitable equation to fit an empirically determined
curve over a given interval has been of great importance in statistical
work, in experimental science, and in engineering technology. Since infinitely
many types of equations may be made to fit the data with required accuracy,
the choice of a “suitable” type of equation depends on the qualitative nature
of the empirical curve, on the use to which the equation is to be put, and upon
considerations of simplicity.

As a function type, the polynomial has, because of its simplicity, been enormously
useful. The function type studied here is a little more general than the
polynomial type, being particularly useful in the case of empirical curves that
become zero or infinity at one or both ends of the interval.

Without loss of generality the interval in which the equation is to fit the curve
may be taken as $0 < x < 1$. It is assumed that, by numerical means or otherwise,
a finite set of moments $\mu_m = \int_0^1 y\,x^m\,dx$ may be computed, $y$ being the
ordinate of the empirical curve.
The problem to be considered here is that of determining a function $f(x)$ of
the form

(1) $f(x) = x^{\alpha}(1 - x)^{\beta}\,\sum_{p=0}^{n} a_p x^p$, $R(\alpha) > -1$, $R(\beta) > -1$

such that

(2) $\int_0^1 f(x)\,x^m\,dx = \mu_m$

as $m$ ranges from zero to the number of the highest moment known. $f(x)$ is
then an approximation to $y$ which may be written

(3) $y \approx f(x)$.


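Theorem 1°. Given a finite set of moments $\mu_0, \mu_1, \mu_2, \ldots, \mu_n$, and given that
$R(\alpha) > -1$, $R(\beta) > -1$, define

(4) $S_p(\alpha, \beta) = \dfrac{\Gamma(p + \alpha + 1)}{\Gamma(p + \alpha + \beta + 1)} \sum_{m=0}^{p} (-)^m \binom{p}{m} \dfrac{\Gamma(p + m + \alpha + \beta + 1)}{\Gamma(m + \alpha + 1)}\,\mu_m,$

(5) $a_k^{(n)} = \dfrac{(-)^k}{k!\,\Gamma(k + \alpha + 1)} \sum_{p=k}^{n} \dfrac{(2p + \alpha + \beta + 1)\,\Gamma(p + k + \alpha + \beta + 1)}{(p - k)!\,\Gamma(p + \beta + 1)}\,S_p(\alpha, \beta),$

(6) $f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n} a_k^{(n)} x^k.$

Then $f(x)$ will satisfy (2) for $m = 0, 1, \ldots, n$.

2°. If, in addition to 1°, $\mu_{n+1}$ is known and $\alpha$ and $\beta$ satisfy

(7) $S_{n+1}(\alpha, \beta) = 0,$

then $f(x)$ will satisfy (2) for $m = n + 1$ also.

3°. If, in addition to 1° and 2°, $\mu_{n+2}$ is also known, and if $\alpha$, $\beta$ also satisfy

(8) $S_{n+2}(\alpha, \beta) = 0,$

then $f(x)$ will satisfy (2) for $m = n + 2$ as well.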

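Proof. Let $P_m^{(\alpha,\beta)}(z)$ be the Jacobi polynomial of order $m$ defined in terms
of the hypergeometric function by

(9) $P_m^{(\alpha,\beta)}(z) = \dfrac{\Gamma(m + \alpha + 1)}{m!\,\Gamma(\alpha + 1)}\,F(-m,\; m + \alpha + \beta + 1;\; \alpha + 1;\; \tfrac{1}{2} - \tfrac{1}{2}z).$

Let $P_m^{(\alpha,\beta)}(1 - 2\mu)$ symbolically represent the expression gotten by substituting
$\mu_k$ for $x^k$ in the expansion of the polynomial $P_m^{(\alpha,\beta)}(1 - 2x)$. There exist numbers
$A_{m,p}$ such that

(10) $x^m = \sum_{p=0}^{m} A_{m,p}\,P_p^{(\alpha,\beta)}(1 - 2x).$

Then

(11) $\mu_m = \sum_{p=0}^{m} A_{m,p}\,P_p^{(\alpha,\beta)}(1 - 2\mu).$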


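For $R(\alpha) > -1$, $R(\beta) > -1$, define

(12) $f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{p=0}^{n} \dfrac{(2p + \alpha + \beta + 1)\,p!\,\Gamma(p + \alpha + \beta + 1)}{\Gamma(p + \alpha + 1)\,\Gamma(p + \beta + 1)}\,P_p^{(\alpha,\beta)}(1 - 2\mu)\,P_p^{(\alpha,\beta)}(1 - 2x).$

Then by (10), for $m = 0, 1, \ldots, n$,

$\int_0^1 f(x)\,x^m\,dx = \sum_{p=0}^{n} \dfrac{(2p + \alpha + \beta + 1)\,p!\,\Gamma(p + \alpha + \beta + 1)}{\Gamma(p + \alpha + 1)\,\Gamma(p + \beta + 1)}\,P_p^{(\alpha,\beta)}(1 - 2\mu) \sum_{q=0}^{m} A_{m,q} \int_0^1 x^{\alpha}(1 - x)^{\beta}\,P_p^{(\alpha,\beta)}(1 - 2x)\,P_q^{(\alpha,\beta)}(1 - 2x)\,dx.$

By the orthogonality of the Jacobi polynomials, [1; §4.3],

$\int_0^1 f(x)\,x^m\,dx = \sum_{p=0}^{m} A_{m,p}\,P_p^{(\alpha,\beta)}(1 - 2\mu).$

By (11),

$\int_0^1 f(x)\,x^m\,dx = \mu_m$, $(m = 0, 1, \ldots, n)$.

It follows from (2) that $f(x)$ as defined in (12) is the $f(x)$ of (1). It remains to be
shown that (12) may be expressed in the form (4)-(6).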

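From (9),

(13) $P_p^{(\alpha,\beta)}(1 - 2x) = \dfrac{\Gamma(p + \alpha + 1)}{\Gamma(p + \alpha + \beta + 1)} \sum_{m=0}^{p} \dfrac{(-)^m\,\Gamma(p + m + \alpha + \beta + 1)}{m!\,(p - m)!\,\Gamma(m + \alpha + 1)}\,x^m,$

so by (4),

(14) $P_p^{(\alpha,\beta)}(1 - 2\mu) = \dfrac{S_p(\alpha, \beta)}{p!}.$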


Inserting (13) and (14) into (12),

$f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{p=0}^{n} \dfrac{2p + \alpha + \beta + 1}{\Gamma(p + \beta + 1)}\,S_p(\alpha, \beta) \sum_{k=0}^{p} \dfrac{(-)^k\,\Gamma(p + k + \alpha + \beta + 1)}{k!\,(p - k)!\,\Gamma(k + \alpha + 1)}\,x^k$

$\quad = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n} \dfrac{(-)^k x^k}{k!\,\Gamma(k + \alpha + 1)} \sum_{p=k}^{n} \dfrac{(2p + \alpha + \beta + 1)\,\Gamma(p + k + \alpha + \beta + 1)}{(p - k)!\,\Gamma(p + \beta + 1)}\,S_p(\alpha, \beta)$

$\quad = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n} a_k^{(n)} x^k$

by (5), so the $f(x)$ of (12) may be expressed in the form (4)-(6), and part 1° of
the theorem is established.

If (7) holds, by (5), $a_k^{(n+1)} = a_k^{(n)}$ for $k = 0, 1, \ldots, n$, and $a_{n+1}^{(n+1)} = 0$.
Therefore, in (6),

$f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n+1} a_k^{(n+1)} x^k,$

and by part 1°, for the case in which $n$ is replaced by $n + 1$, it follows that (2)
holds for $m = n + 1$, so part 2° is established. The establishment of part 3°
is essentially the same.
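
Part 1° of the theorem lends itself to a direct numerical check. The sketch below is a minimal illustration, assuming the forms $S_p(\alpha,\beta) = \frac{\Gamma(p+\alpha+1)}{\Gamma(p+\alpha+\beta+1)}\sum_m (-)^m \binom{p}{m}\frac{\Gamma(p+m+\alpha+\beta+1)}{\Gamma(m+\alpha+1)}\mu_m$ for (4) and the matching double sum for (5), and restricted to real $\alpha$, $\beta$ because it relies on `math.gamma`. The function names are hypothetical. The moments of the fitted curve are obtained in closed form from the Beta integral $\int_0^1 x^{u-1}(1-x)^{v-1}dx = \Gamma(u)\Gamma(v)/\Gamma(u+v)$.

```python
import math

def S(p, alpha, beta, mu):
    """Quantity S_p(alpha, beta) in the form assumed for (4); mu[m] is the m-th moment."""
    pref = math.gamma(p + alpha + 1) / math.gamma(p + alpha + beta + 1)
    return pref * sum(
        (-1) ** m * math.comb(p, m)
        * math.gamma(p + m + alpha + beta + 1) / math.gamma(m + alpha + 1) * mu[m]
        for m in range(p + 1)
    )

def coefficients(alpha, beta, mu):
    """Coefficients a_k of (5) for n = len(mu) - 1."""
    n = len(mu) - 1
    s = [S(p, alpha, beta, mu) for p in range(n + 1)]
    coeffs = []
    for k in range(n + 1):
        inner = sum(
            (2 * p + alpha + beta + 1) * math.gamma(p + k + alpha + beta + 1)
            / (math.factorial(p - k) * math.gamma(p + beta + 1)) * s[p]
            for p in range(k, n + 1)
        )
        coeffs.append((-1) ** k * inner / (math.factorial(k) * math.gamma(k + alpha + 1)))
    return coeffs

def fitted_moment(m, alpha, beta, coeffs):
    """m-th moment of f(x) = x^alpha (1-x)^beta sum_k a_k x^k via Beta integrals."""
    def beta_fn(u, v):
        return math.gamma(u) * math.gamma(v) / math.gamma(u + v)
    return sum(a * beta_fn(alpha + k + m + 1, beta + 1) for k, a in enumerate(coeffs))
```

For any real $\alpha, \beta > -1$ the moments returned by `fitted_moment` should reproduce the supplied $\mu_0, \ldots, \mu_n$ up to rounding; for larger $n$ the alternating sums lose accuracy in floating point, so exact rational arithmetic may be preferable.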

In applying this theorem to the problem of empirical curve fitting, it follows
from (6) that the constants $\alpha$ and $\beta$ should differ from zero only if the empirical
curve approaches zero or infinity at one or both of its endpoints. With this
in mind the following rules may be stated:

Case A. If, in the empirical curve, $f(0) \neq 0$ or $\infty$, and $f(1) \neq 0$ or $\infty$, set
$\alpha = \beta = 0$, and let $n$ be one less than the number of moments that it is desired
to fit.

Case B. If $f(0) = 0$ or $\infty$ and $f(1) \neq 0$ or $\infty$, set $\beta = 0$ and determine $\alpha$ from
(7), $n$ being two less than the number of moments that it is desired to fit.

Case C. If $f(0) \neq 0$ or $\infty$ and $f(1) = 0$ or $\infty$, set $\alpha = 0$ and determine $\beta$
from (7), $n$ being two less than the number of moments that it is desired to fit.

Case D. If $f(0) = 0$ or $\infty$ and $f(1) = 0$ or $\infty$, determine both $\alpha$ and $\beta$ from
the two equations (7) and (8), $n$ being three less than the number of moments
that it is desired to fit.

It may happen that these processes cannot be carried out, or at least cannot be
conveniently carried out. If this is the case, $\alpha$ or $\beta$ may be set arbitrarily and $n$
taken as one unit higher than before, or both $\alpha$ and $\beta$ may be set, and $n$ taken
as two units higher than before.

In Case D, above, the solution of equations (7) and (8) may often prove
difficult, making it advisable to follow the suggestions of the last paragraph.
In certain special cases, however, their solution is not difficult.

Suppose, for example, the moments satisfied the equations 



If this is substituted into (4), and the order of summation reversed, on making
use of the identity

(16) $\sum_{p=0}^{n} (-)^p \binom{n}{p} \dfrac{\Gamma(p + a)}{\Gamma(p + \nu)} = (-)^n\,\dfrac{\Gamma(a)\,\Gamma(a - \nu + 1)}{\Gamma(a - \nu - n + 1)\,\Gamma(n + \nu)},$

one obtains

(17) $S_p(\alpha, \beta) = (-)^p S_p(\beta, \alpha).$
Therefore

(18) $S_{2p+1}(\alpha, \alpha) = 0.$

When $n$ is an integer, either $n + 1$ or $n + 2$ is odd. Therefore when (15)
holds, one of either (7) or (8) will be satisfied identically if we take $\beta = \alpha$. The
other may then be solved for $\alpha$.

As an example, suppose one had the moments $\mu_0 = 1$, $\mu_1 = \tfrac{1}{2}$, $\mu_2 = \tfrac{7}{24}$, $\mu_3 = \tfrac{3}{16}$,
$\mu_4 = \tfrac{31}{240}$ and wished to obtain an $f(x)$ such that $f(0) = 0$, $f(1) = 0$. In this
case $n = 2$, and (15) is satisfied. It follows that (7) is satisfied identically when
$\beta = \alpha$, and (8) gives

$\dfrac{\Gamma(2\alpha + 5)}{\Gamma(\alpha + 1)} - 4\,\dfrac{\Gamma(2\alpha + 6)}{\Gamma(\alpha + 2)}\left(\dfrac{1}{2}\right) + 6\,\dfrac{\Gamma(2\alpha + 7)}{\Gamma(\alpha + 3)}\left(\dfrac{7}{24}\right) - 4\,\dfrac{\Gamma(2\alpha + 8)}{\Gamma(\alpha + 4)}\left(\dfrac{3}{16}\right) + \dfrac{\Gamma(2\alpha + 9)}{\Gamma(\alpha + 5)}\left(\dfrac{31}{240}\right) = 0.$

This easily reduces to

$1 - 4\,\dfrac{\alpha + 5/2}{\alpha + 1} + 7\,\dfrac{(\alpha + 5/2)(\alpha + 3)}{(\alpha + 1)(\alpha + 2)} - 6\,\dfrac{(\alpha + 5/2)(\alpha + 7/2)}{(\alpha + 1)(\alpha + 2)} + \dfrac{31}{15}\,\dfrac{(\alpha + 5/2)(\alpha + 7/2)}{(\alpha + 1)(\alpha + 2)} = 0,$

which reduces to the quadratic

$4\alpha^2 - 6\alpha + 5 = 0,$

from which

(19) $\alpha = \beta = 3/4 \pm (1/4)\sqrt{11}\,i.$

These may be substituted into (4)-(6) to complete the solution.
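
The worked example can be checked with complex arithmetic. The sketch below is an illustration only: it assumes the moments $1, \tfrac12, \tfrac{7}{24}, \tfrac{3}{16}, \tfrac{31}{240}$ and evaluates $S_4(\alpha, \alpha)$ after division by its leading Gamma ratio, so that only the Pochhammer-type quotients $(2\alpha+5)_m/(\alpha+1)_m$ are needed and no complex Gamma function is required. The function name is hypothetical.

```python
import math
from fractions import Fraction

# Moments assumed for the worked example (mu_0 .. mu_4).
MOMENTS = [Fraction(1), Fraction(1, 2), Fraction(7, 24), Fraction(3, 16), Fraction(31, 240)]

def reduced_s4(alpha):
    """S_4(alpha, alpha) divided by Gamma(2a+5)/Gamma(a+1), up to a constant factor.

    Term m equals (-1)^m C(4, m) mu_m (2a+5)_m / (a+1)_m, where (.)_m is a
    rising factorial; alpha may be complex.
    """
    total = 0.0
    ratio = 1.0
    for m, mu in enumerate(MOMENTS):
        if m > 0:
            # extend (2a+5)_m / (a+1)_m by one more factor
            ratio *= (2 * alpha + 4 + m) / (alpha + m)
        total += (-1) ** m * math.comb(4, m) * float(mu) * ratio
    return total

# Roots of the quadratic 4a^2 - 6a + 5 = 0.
ROOTS = [complex(0.75, math.sqrt(11) / 4), complex(0.75, -math.sqrt(11) / 4)]
```

Evaluating `reduced_s4` at either entry of `ROOTS` should give a value that vanishes up to rounding, in agreement with the quadratic above; since the roots are complex, the resulting $f(x)$ has complex exponents.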
CONSISTENCY OF SEQUENTIAL BINOMIAL ESTIMATES

By J. Wolfowitz
Columbia University

The notion of consistency of an estimate, introduced by R. A. Fisher, applies 
to a sequence of estimates which converge stochastically, with boundlessly 
increasing sample size, to the parameter (or parameters) being estimated. Each 
estimate is a function of a sample of observations, the number in each sample 
being determined independently of the observations themselves. In sequential 
estimation, on the other hand, the number of observations is itself a chance 
variable, determined by the sequence of observations and the application to
them of a rule which may be part of a sequential test. In what follows we
consider that the operation of sequential estimation is associated with a
sequential test.¹

The advantage of using consistent estimates is such as to suggest extension
of the idea of consistency to sequential estimation. In the present paper we
shall be concerned only with the estimation of a binomial probability ($p$ say).
The obvious extension is that a sequence of estimates, each with its associated
test, is consistent if the estimates converge stochastically to $p$.

Since the number of observations required by a sequential test is a chance
variable, a parallel to the classical sequence of samples of increasing size would
be a sequence of sequential tests whose average (in some sense) sample sizes
increase without limit. It seems reasonable to associate only such a sequence
of estimates with this sequence of tests as will converge stochastically to $p$,
i.e., be consistent.

Let $z$ be a chance variable which takes the distinct values $c_1$ and $c_2$ with
probabilities $p$, $0 < p < 1$, and $q = 1 - p$, respectively. Let $z_1, \ldots, z_n$ be a sequence
of independent observations on $z$ which terminates with the $n$th according to the
specific sequential test under consideration. Denote by $x$ and $y$, respectively,
the number of observations $c_2$ and $c_1$ in this sequence. Then $x$, $y$ and $n = x + y$
are all chance variables. The couple $g = (x, y)$ is called a boundary point of
index $n$ (see [1]). The sequence of observations which terminates at $g$ is called a
path. Let $k(g)$ denote the number of paths which terminate at $g$, and let $k^*(g)$
denote the number of these paths whose first observation is $c_1$. The “points”
on the various paths together with all the points $g$ constitute the “region” under
discussion.

Let $P\{n = j\}$ denote the probability of the relation in braces. If

$\sum_{j=1}^{\infty} P\{n = j\} = 1,$

the region is called closed. Only closed regions will be considered below, so that
this assumption will henceforth be made without explicit formulation. It has
been shown by Girshick, Mosteller, and Savage [1], that $\hat p(g) = k^*(g)/k(g)$
is an unbiased estimate of $p$ for any closed region $R$, i.e.,

$\sum_{g} \hat p(g)\,k(g)\,p^y q^x = p,$

where the summation takes place over all the boundary points $g$ of $R$. For
many important regions this estimate is the unique unbiased estimate.
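
For a small bounded region the path counts $k(g)$ and $k^*(g)$ can be enumerated directly, which makes the unbiasedness relation easy to check. The sketch below is illustrative only: the function names and the stopping rule used as an example (stop at the second $c_1$ or the third $c_2$, whichever comes first) are assumptions introduced here, not taken from the paper.

```python
from collections import defaultdict

def boundary_counts(stop, max_n):
    """Count k(g) and k*(g) for every boundary point g = (x, y).

    x counts observations c2, y counts observations c1; stop(x, y) is True
    when sampling terminates at (x, y). Raises if some path is still
    unfinished after max_n observations.
    """
    k = defaultdict(int)
    k_star = defaultdict(int)
    live = {(0, 0): (1, 0)}  # point -> (paths reaching it, those starting with c1)
    for n in range(max_n):
        nxt = defaultdict(lambda: [0, 0])
        for (x, y), (cnt, cnt_c1) in live.items():
            for dx, dy in ((1, 0), (0, 1)):  # observe c2, then c1
                g = (x + dx, y + dy)
                started_c1 = cnt if (n == 0 and dy == 1) else cnt_c1
                if stop(*g):
                    k[g] += cnt
                    k_star[g] += started_c1
                else:
                    nxt[g][0] += cnt
                    nxt[g][1] += started_c1
        live = {g: tuple(v) for g, v in nxt.items()}
    if live:
        raise ValueError("region is not closed within max_n observations")
    return dict(k), dict(k_star)

def curtailed_rule(x, y):
    """Hypothetical rule: stop at the second c1 or the third c2."""
    return y == 2 or x == 3
```

With `k, k_star = boundary_counts(curtailed_rule, 4)`, the estimate at a boundary point `g` is `k_star[g] / k[g]`, and summing `k_star[g] * p**y * q**x` over the boundary should return `p` for any `p` between 0 and 1.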

Let there be given an infinite sequence of sequential tests with each of which
we associate the estimate $\hat p(g)$. Consider the $i$th one of these, and let $n_{0i}$ be
the smallest number of observations required for a decision, i.e., $n_{0i}$ is the smallest
value of $j$ for which $P\{n = j\} \neq 0$. The theorem proved below asserts that if
$n_{0i}$ approaches infinity with $i$ the estimate $\hat p(g)$ converges stochastically to $p$.
To put it in other words: if $T_1, T_2, \ldots$ is the sequence of tests, and $\epsilon_1$ and $\epsilon_2$
are arbitrarily small positive numbers, there exists a positive number $J(\epsilon_1, \epsilon_2)$
such that, for all $i$ such that $i > J$,

$P\{|\hat p(g) - p| > \epsilon_1\} < \epsilon_2,$

when $n_{0i} \to \infty$. An important example of such a sequence is that of the Wald
sequential binomial tests [2] obtained as follows: Let $\alpha_1, \ldots, \alpha_i, \ldots$ and
$\beta_1, \beta_2, \ldots, \beta_i, \ldots$ be two sequences of positive numbers all of which are less
than $\tfrac{1}{2}$ and which approach zero as $i \to \infty$. Let $p_0$ and $p_1$, $0 < p_0 < p_1 < 1$,
be two fixed numbers.

¹ Really all that is required is a rule for terminating the observations such that its region
R is closed (see below). However, we defer to conventional statistical usage in referring
to “tests.”

Let

$c_1 = \log\dfrac{p_1}{p_0}$, $c_2 = \log\dfrac{1 - p_1}{1 - p_0}$, $Z_n = \sum_{j=1}^{n} z_j.$

Finally let the rule for terminating the process of drawing observations be as
follows for the $i$th test $T_i$: The process of drawing observations terminates at
the smallest integer $n$ for which either

$Z_n > \log\dfrac{1 - \beta_i}{\alpha_i}$ or $Z_n < \log\dfrac{\beta_i}{1 - \alpha_i}.$

Since $(1 - \beta_i)/\alpha_i \to \infty$ and $\beta_i/(1 - \alpha_i) \to 0$ while $c_1$ and $c_2$ are constant, it is
evident that the hypothesis of the theorem is satisfied.
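
That the smallest possible sample size grows without limit as the error levels shrink can be seen from a short computation. The sketch below assumes the termination rule in the strict form $Z_n > \log\frac{1-\beta}{\alpha}$ or $Z_n < \log\frac{\beta}{1-\alpha}$, with increments $c_1 = \log(p_1/p_0) > 0$ and $c_2 = \log\frac{1-p_1}{1-p_0} < 0$; the quickest way to reach either boundary is then a run of identical observations. The function name is hypothetical.

```python
import math

def smallest_sample_size(p0, p1, alpha, beta):
    """Smallest n at which the assumed Wald rule can terminate (0 < p0 < p1 < 1,
    alpha and beta below 1/2)."""
    c1 = math.log(p1 / p0)                  # increment for an observation c1 (> 0)
    c2 = math.log((1 - p1) / (1 - p0))      # increment for an observation c2 (< 0)
    upper = math.log((1 - beta) / alpha)    # upper boundary, positive
    lower = math.log(beta / (1 - alpha))    # lower boundary, negative
    n_upper = math.floor(upper / c1) + 1    # smallest n with n * c1 > upper
    n_lower = math.floor(lower / c2) + 1    # smallest n with n * c2 < lower
    return min(n_upper, n_lower)
```

Calling `smallest_sample_size(0.4, 0.6, a, a)` for decreasing `a` gives a non-decreasing sequence, which is the sense in which $n_{0i} \to \infty$ for this family of tests.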

The property of being unbiased is not generally considered an indispensable
characteristic of an optimum estimate, while consistency is generally so regarded.
Our theorem shows that $\hat p(g)$ enjoys the latter property with respect to important
sequences of sequential tests.

Theorem: Let $T_1, \ldots, T_i, \ldots$ be a sequence of sequential binomial tests.
For the $i$th test $T_i$ let $n_{0i}$ be the smallest integer such that $P\{n = n_{0i}\} \neq 0$. Finally
let $n_{0i} \to \infty$ as $i \to \infty$. Then $\hat p(g)$ converges stochastically to $p$ as $i \to \infty$.

Proof: For typographic simplicity we shall use $n_0$ as the designation of the
generic element of the sequence $n_{01}, n_{02}, \ldots$. No confusion will be caused
thereby.

Let $n' = n_0 - 1$, and $\delta_1 > 0$ and $\delta_2 > 0$ be arbitrarily small fixed numbers.
Let $k'(g)$ be the number of paths which end at the point $g$ and are such that
$|y'/n' - p| < \delta_1$, where $y'$ is the number of observations $c_1$ among the first $n'$
observations. We then have
Lemma 1. For $n_0$ sufficiently large

(1) $\sum_{g \in B} k'(g)\,p^y q^x > 1 - \delta_2$

where $B$ is the set of boundary points of $R$.

Proof: Consider the totality $\{h\}$ of all points $h = (x', y')$, with $x' + y' = n'$.
Here $x'$ and $y'$ denote, respectively, the number of observations $c_2$ and $c_1$ in the
sequence of the first $n'$ observations on $z$. Let $k_0(h)$ denote the number of paths
to $h$. Let $C$ denote the set of points $h$ such that $|y'/n' - p| < \delta_1$. If $n_0$ is
large enough we have, by the law of large numbers,

$\sum_{h \in C} k_0(h)\,p^{y'} q^{x'} > 1 - \delta_2.$

Let $k(h, g)$ be the number of paths from $h$ to $g$. From Theorem 2' of [3] it
follows that

(A) $\sum_{g \in B} k(h, g)\,p^y q^x = p^{y'} q^{x'}.$

Also from the definitions of the various symbols involved it readily follows that

$k'(g) = \sum_{h \in C} k_0(h)\,k(h, g).$

Hence

$\sum_{g \in B} k'(g)\,p^y q^x = \sum_{g \in B} \Big(\sum_{h \in C} k_0(h)\,k(h, g)\Big) p^y q^x = \sum_{g \in B} \Big(\sum_{h \in C} k_0(h)\,k(h, g)\,p^y q^x\Big)$

$\quad = \sum_{h \in C} k_0(h) \Big(\sum_{g \in B} k(h, g)\,p^y q^x\Big) = \sum_{h \in C} k_0(h)\,p^{y'} q^{x'} > 1 - \delta_2.$

This proves Lemma 1.

Let $\zeta(g) = [k(g) - k'(g)]/k(g)$. Thus $\zeta(g)$ is a chance variable, being a function
of the chance point $g$.

Lemma 2. Let $\delta_3$ and $\delta_4$ be arbitrarily small positive numbers. For $n_0$ sufficiently
large

(2) $P\{\zeta(g) \le \delta_3\} > 1 - \delta_4.$

Proof: If (2) were not true, we would have

(3) $E\left[\dfrac{k'(g)}{k(g)}\right] = \sum_{g \in B} k'(g)\,p^y q^x \le (1 - \delta_4) + (1 - \delta_3)\,\delta_4 = 1 - \delta_3\delta_4.$

Choose the $\delta_2$ of Lemma 1 so that $\delta_2 < \delta_3\delta_4$. For some large value of $n_0$ we
would then have a contradiction between (1) and (3). This proves the lemma.

Let $g$ be any boundary point. Consider any path whose $y'$ is such that
$|y'/n' - p| < \delta_1$; let us call such a path one of type $T$. Consider the terminal
sequence $S$ of this path,

$S = z_{n_0}, z_{n_0+1}, \ldots$

This sequence, together with $g = (x, y)$, uniquely determines $y'$. Any permutation
of $y'$ elements $c_1$ and $n' - y' = x'$ elements $c_2$ may serve as the initial sequence
of $n'$ observations of a path which terminates at $g$ and has the terminal sequence
$S$. For no boundary point is of index smaller than $n_0$, so that under permutation
of the first $n'$ observations a path remains a path, i.e., the process of taking
observations will not terminate prematurely as a result of the permuting of the
elements. Of these permutations a proportion $y'/n'$ begin with the element $c_1$.
We deal in this manner with all the different terminal sequences of the paths of
type $T$ which end at $g$. Let $k^{*\prime}(g)$ be the number of these which begin with $c_1$.
We obtain

Lemma 3. For all $g$ such that $k'(g) \neq 0$

$\left| \dfrac{k^{*\prime}(g)}{k'(g)} - p \right| < \delta_1.$

Putting Lemmas 2 and 3 together we have

Lemma 4. As $n_0 \to \infty$, $k^{*\prime}(g)/k(g)$ converges stochastically to $p$.

Now it follows in a manner similar to that of Lemma 2 that, as $n_0 \to \infty$,
$k^{*\prime}(g)/k^*(g)$ converges stochastically to one. This, together with Lemma 4,
proves the theorem.

This book represents a contribution of a novel kind to the statistical literature
and will render valuable services both as textbook and reference book. Of its
three parts the first one (134 pages) is entitled Mathematical Introduction and
develops the necessary formal mathematical tools. The second part (180 pages)
is devoted to Random Variables and Probability Distributions, that is to say, to a
chapter of the modern theory of probability. The third, and main, part of the
book (some 233 pages) is entitled Statistical Inference. Ordinarily these three
topics would require consultation of three or more books, and these would rarely
be found on the same shelf. However, the masterly exposition succeeds in creating
the impression of natural unity and harmony. The ideas are developed with
elegance and apparent ease as if the line of presentation followed a well explored
path. The uninitiated will not notice how unconventional the treatment is and
how the very selection of topics depends on the author's scientific personality.

It is hardly necessary to point out that Cramér's book fills an urgent need.
The emergence of statistical theory and methodology as an exact science, firmly
grounded in mathematical probability, is only of recent date. Its rapid development
went hand in hand with an extraordinary increase of the number and importance
of its various applications. Under such circumstances there was
naturally little time for an exposition of the theoretical foundations and
ramifications. Modern statistical inference has its roots in the classical limit
theorems of probability. Now classical probability used to consist of a bewildering
collection of special and mutually uncorrelated problems; unified guiding principles
and methods are a rather new development and have not yet found expression
in the textbook literature. The original investigations are usually written in an
exceedingly abstract language and the existing close ties to applications are not
apparent. Consequently, there is no easy access either to probability or statistics
and it is often difficult to establish whether, or to what extent, various assertions
have actually been proved. The present book therefore closes a serious
gap in the literature and will greatly facilitate both teaching and research.

Of the 12 chapters of the Mathematical Introduction 9 are devoted to the theory
of measure and integration. The antiquated theory of the so-called Riemann
integral (kept alive by elementary textbooks) considered only point functions
y = f(x), where the independent variable is a point. The temperature at a
[...] moment are typical examples. Many
mathematical considerations simplify greatly if from the very beginning also set
functions y = F(A) are introduced, where the independent variable is a set.
Typical examples are mass in mechanics, the amount of heat or of electricity,
area or wealth of a geographic region, and the probability of events (i.e. sets in
sample space). The Lebesgue-Stieltjes theory frees the concept of integral from
artificial devices and reduces it to the natural notion of mean values with respect
to set functions. In a simile, believed to be due to Lebesgue, the Riemann integral
corresponds to the procedure of a grocer who computes the day's receipts
by actually adding the several amounts in the order as they had come in. The
Lebesgue procedure imitates the more intelligent grocer who orders his cash in
piles of notes and coins of equal denomination and counts them. The analogy
with the customary procedure of computing mathematical expectation is clear.
The Lebesgue-Stieltjes integral is conceptually simpler than the Riemann integral
and can be presented in as simple a way with rigor adequate for elementary
textbooks. It has become an indispensable tool in probability, statistics, physics,
and other applied fields. Since it has, unfortunately, not found its way into
calculus textbooks, physicists are compelled to use the less flexible notion of the
Dirac δ-function, and the formal mathematical apparatus in general becomes
unnecessarily clumsy. It is a curious anomaly that so many calculus textbooks
profess to be written with a view to applications and yet completely disregard
the most obvious practical needs and that the teaching of practical mathematics
should remain uninfluenced by the great developments of the last fifty years.
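
The grocer simile can be made concrete with a few lines of Python. This is only an illustrative sketch: the list of receipts is invented, and the variable names are hypothetical. It shows that adding the amounts in order of arrival and counting piles of equal denomination give the same total, and that the second procedure is exactly how a mathematical expectation is computed.

```python
from collections import Counter

# Hypothetical payments, listed in the order in which they came in.
receipts = [5, 1, 5, 20, 1, 1, 10, 5, 20, 1]

# First grocer: add the several amounts in order of arrival.
total_in_order = sum(receipts)

# Second grocer: sort the cash into piles of equal denomination,
# then weight each denomination by the size of its pile.
piles = Counter(receipts)
total_by_piles = sum(value * count for value, count in piles.items())

# An expectation is formed the same way: each value is weighted by
# its relative frequency.
mean_by_piles = sum(value * count / len(receipts)
                    for value, count in piles.items())

print(total_in_order, total_by_piles, mean_by_piles)
```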

In such circumstances the chapters on integration will be particularly welcome
to statisticians as probably the only place in the literature where they will find
easy access to the theory. Of course, this exposition leads far beyond what the
average statistician will require under ordinary circumstances and beyond the
necessary prerequisites of the main body of the book. Of the 88 pages roughly
half can be omitted at first reading in accordance with detailed instructions given
in the Preface. The remaining half will form a valuable reference book for
theorems and tools used occasionally in connection with more delicate parts of
statistical theory. The mathematical introduction contains also a chapter on
Fourier integrals (characteristic functions), one on matrices and quadratic forms,
and finally miscellaneous complements such as orthogonal polynomials, Euler's
summation formula, beta and gamma functions, etc.

The title to the second part, Random Variables and Probability Distributions,
is the same as that of the author's well-known Cambridge Tract of 1937. Both
start with a discussion of the foundations along axiomatic lines. The new treatment
does not differ essentially from the old one, but some changes are introduced
which are regrettable in the reviewer's opinion (in particular axiom 3).
Otherwise there is practically no overlap between the two expositions. The 1937
booklet devoted much space to the asymptotic expansions connected with the
central limit theorem which are due to the author himself. This topic is not
touched upon in the present book. This is a judicious procedure since the 1937
booklet is generally accessible (although at present sold out). Instead we now
find a detailed study of some univariate distributions such as Student's t,
Fisher's z, the Pearson system, etc., none of which were mentioned in the Cambridge
tract. Similarly, there is now a section on correlation and regression,
and the normal distributions in several variables. The theory of probability is
developed only to the extent of the formal theory of distribution functions. This
implies that even so important a notion as stochastic convergence is treated only
summarily while the strong law of large numbers falls completely outside the
framework of the book. This is regrettable inasmuch as the strong law is of
greater importance than the classical weak law (whose fame rests essentially on
a classical misunderstanding). It should be mentioned that this second part of
the book contains some 39 well chosen illustrative exercises the solution of which
is left to the reader.

In the main part of the book, entitled Statistical Inference, the outer form
changes inasmuch as the text there is accompanied by numerous practical examples.
However, the exposition remains mathematical in nature and the main
emphasis rests on exact formulations; much attention is paid to the establishment
of the precise conditions of validity of the individual theorems, their logical
interrelations and their connections with general probability. The expert will
find many minor and major improvements in formulations and proofs. They are
too numerous to be listed here. Suffice it to point out, as a typical example, the
theorem on pp. 426-27 concerning the limiting form of the χ² distribution with
estimated parameters; this theorem appears to be more general than usually
stated and also the proof seems to be novel. The topics treated in the statistical
part of the book will be seen from the following list of titles to the chapters. 25.
Preliminary Notions on Sampling. 26. Statistical Inference (general orientation).
27. Characteristics of Sampling Distributions (moments, semi-invariants,
corrections for grouping, etc.). 28. Asymptotic Properties of Sampling
Distributions (moments, extreme values, range, etc.). 29. Exact Sampling
Distributions (degrees of freedom, Student, Fisher, correlation and regression
coefficients, partial and multiple correlations, generalized variance, etc.). 30. Tests
of Goodness of Fit and Allied Tests (treating mostly applications of χ²). 31.
Tests of Significance for Parameters. 32. Classification of Estimates (sufficient,
efficient and asymptotically efficient estimates, minimum variance, etc.). 33.
Methods of Estimation (method of moments, maximum likelihood, χ²-minimum
methods). 34. Confidence Regions. 35. General Theory of Testing Statistical
Hypotheses. 36. Analysis of Variance. 37. Some Regression Problems. There
follow tables of the normal distribution, the χ² and the t-distributions, and a long
list of references.

If an expression of wishes for a second edition were permitted, most statisticians
would probably give first choice to non-parametric and sequential tests.
It is needless to point out that the latter became public only after completion of
the Swedish edition of the present book.

Even this short account will show the extremely wide range of topics and
theories covered in the book, from abstract integration to randomized experiments.
They are all presented with uniform lucidity. The exposition throughout
is formal, and yet inspiring, rigorous and yet never pedantic. It will serve
as an example worthy of imitation and is an achievement on which the author
deserves our sincere congratulations.


The Advanced Theory of Statistics. Vols. I and II. Maurice G. Kendall.
London: C. Griffin and Co., Ltd. Vol. I. Second ed. revised, 1945; pp. xii,
457; 50 shillings. Vol. II. 1946; pp. viii, 521; 42 shillings.

Reviewed by M. S. Bartlett
Cambridge University and The University of North Carolina

With the recent appearance of the second volume, it is now possible to review
as one work this comprehensive treatise. To quote the author's opening remarks
to the Preface to Volume I: “The need for a thorough exposition of the
theory of statistics has been repeatedly emphasized in recent years. The object
of this book is to develop a systematic treatment of that theory as it exists at the
present time.” An outline of the contents, which in the two volumes make up
just on a thousand pages, will indicate that this formidable task has been squarely
faced by the author, who, when a tentative co-operative venture of writing such
a treatise was upset by the outbreak of the war, continued alone with the project.

Volume I contains sixteen chapters. The first six introduce the concept of
frequency distributions via observational data on groups and aggregates, and
their mathematical representation (Ch. 1), measures of location and dispersion
(Ch. 2) and moments and cumulants in general (Ch. 3), characteristic functions
(Ch. 4), and ending with a description of the standard distribution functions, such
as the binomial, Poisson, hypergeometric and normal distributions, and the
Pearson and Gram-Charlier systems. The next section opens with probability
(Ch. 7) and proceeds to sampling theory (Chs. 8-11), including a chapter (Ch. 10)
on exact sampling distributions, many of the standard sampling distributions
being used in this chapter to illustrate the mathematical methods available for
obtaining sampling distributions. Chapter 11 deals with the general sampling
theory of cumulants, including a useful reference list of formulae and a
demonstration, due to the author, of the validity of Fisher's combinatorial rules for
obtaining these formulae. The section concludes with a chapter on the
Chi-square distribution and some of its applications. The last four chapters of
Volume I deal with association and contingency, correlation, including partial
and multiple correlation, and rank correlation; this last chapter being a
comprehensive treatment including comparatively recent results of the author.

It will be convenient to list also the contents of Volume II before any critical
comment on either volume. The first section of the second volume comprises
four chapters on the theory of estimation, including a derivation of the properties
of the maximum likelihood estimate (Ch. 17) and separate chapters on Fisher's
theory of fiducial probability and Neyman's theory of confidence intervals. The
second main section, according to the author's remarks in the preface to Volume
II, deals with the theory of statistical tests and comprises chapters 21, 23, 24, 26,
27 and 28; of these, after an introductory chapter (Ch. 21) on tests of significance,
chapters 23 and 24 cover analysis of variance, chapters 26 and 27 give a fairly
detailed account of the general theory of significance-tests originated by Neyman
and Pearson, and Chapter 28 deals with the recently developed techniques of
multivariate analysis. The remaining chapters are 22 on regression, 25 on the
design of sampling enquiries, and Chapters 29 and 30 on time-series, another
subject in which the author has himself taken an active interest. Finally, there
are two appendices, A consisting of a few addenda to Volume I, and B an extensive
bibliography of theoretical statistical papers.
The volumes are attractively printed; and each chapter concludes with a useful
collection of examples for the reader.


In any comprehensive treatment of a wide subject there can be no clearly
defined order of presentation; nevertheless, the author's order of chapters in Volume
II and in particular his inclusion of analysis of variance among the chapters on the
theory of statistical tests is a little puzzling, and the reviewer's preference would
have been to see this important subject treated earlier, together with regression
analysis, and their link with the classical method of least squares more firmly
outlined. Incidentally, there appears to be no mention of the Fourier analysis
of observational data except in its relation to periodogram analysis (Ch. 30).
This change of order would perhaps also have allowed a shift forward of Chapter
25 on the design of sampling enquiries, and a more compact section on multiple
correlation, culminating with the chapter on multivariate analysis before the
chapters on the general theory of statistical inference were begun.

Another arrangement of rather doubtful value in Volume II is the allocation of
separate chapters to fiducial probability and to the theory of confidence intervals.
The problem of how to deal with a field which is still a battleground is
admittedly not an easy one, and this particular one is an embarrassment at
present to many teachers, but it may be questioned whether strict impartiality
is the best answer. To take a hypothetical example, there would seem to be no
particular virtue in a textbook which expounded, in parallel, statistical methods
of inference using direct probabilities and the method of “inverse probability”,
leaving the reader to decide at the end which he should adopt.

The most criticizable arrangement, however, occurs in Volume I with the late
and rather scanty treatment of probability in Chapter 7. To begin with examples
of statistical data is sound, but since the whole conceptual model erected
for such data is based on probability theory, it does not seem sufficient
for a reader “who feels keenly on the subject” to do as the author suggests in the
Preface and read Chapters 7 and 8 after Chapter 1. Even if he does so, he will
find no very clear exposition of the statistical theory of probability, no mention,
for example, of the laws of large numbers, whether for simple dichotomies or for
entire continuous distribution functions, that show how the conceptual model
[...] corresponds with the empirical notions of “in the long run” or “for
a large enough sample”. The actual arrangement, moreover, leads to an apparently
rather arbitrary treatment of theorems on limiting distributions;
the First Limit Theorem, which deals with the equivalence of the limits of
distribution function and corresponding characteristic function sequences, is given
in the chapter on characteristic functions (Ch. 4), and the Central Limit Theorem,
dealing with the convergence to normality of a sum of n independent random
variables, is given in the chapter on probability.

In the proof of the second part of the First Limit Theorem, dealing with the
conditions under which a sequence of characteristic functions $\phi_n(t)$ determine
the limiting distribution function $F(x)$, the author has not yet corrected an error
that occurred in Cramér's original version, which Kendall follows (section 4.12).
Correct conditions for convergence of the distribution function sequence $F_n(x)$ to
$F(x)$ (at all continuity points of $F$) are convergence of the characteristic function
sequence to $\phi(t)$ for all real $t$, uniformly in at least some finite $t$ interval (cf. H.
Scheffé, Math. Reviews, Vol. 6 (1945), p. 89).

Another proof in Volume I which appears to need clarification is the geometrical
derivation of the distribution of the multiple correlation coefficient in the case
of a non-zero true correlation (section 15.21). The blunt statement is made,
following equation (15.51), that the sample correlation coefficient R and an angle
θ (defined in the text) are independent, a statement which is incorrect. However,
if the logic of Fisher's original derivation is examined, it turns out that the
relation of R and θ is only required when the true correlation is zero; under such
conditions R and θ are independent.

In Volume II there is a sentence requiring correction and amplification in the
derivation (in the case of zero true canonical correlations) of the sampling canonical
correlation distribution (section 28.30). The sentence “Consider the distribution
for a given value of and z„ • • • ” should be corrected to read “Consider
the distribution for a given value of f;, Zi, ■”. Some justification that
the distribution is independent of t„- -t- z„ is then still needed.

There is inevitably, owing to the time the book was written, no mention of
sequential analysis, the sampling technique developed during the war by Wald
and others and only recently “derestricted”. Again, in chapter 18, where the
work of Aitken and Silverstone on unbiased estimates with minimum variance is
referred to, the simple inequality connecting the variance of any unbiased estimate
with Fisher's information function throws an interesting new light on this
aspect of the estimation problem (see, for example, H. Cramér, Mathematical
Methods of Statistics, section 32.3, or C. R. Rao, Bulletin Calcutta Math. Soc.,
Vol. 37 (1945), p. 81), but was not known to the author when this chapter was
written. Such omissions are merely an indication of the developing nature of the
subject, and it is hoped they can be remedied in later editions. There is, however,
especially in Volume II, an occasional impression of patchiness in the treatment
not altogether excusable on such grounds. This can perhaps be illustrated
from the last chapter, a valuable contribution to the still-growing subject of
time-series, but where the importance of some known results does not always
seem sufficiently stressed; in particular, the Wiener-Khintchine relation between
the periodogram and correlogram is noted (section 30.68) as an “interesting
relation”, whereas it is a fundamental relation in the modern method of approach
to time-series, giving much deeper insight into the correct interpretation of
classical periodogram analysis.
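
The relation between periodogram and correlogram mentioned last can be checked numerically on a finite series. The sketch below rests on stated assumptions and does not follow Kendall's notation: the periodogram is taken as the squared modulus of the finite Fourier sum divided by n, the serial covariances use the divisor n, and the series (already reduced to mean zero) is invented. The correlogram is these serial covariances divided by the one of lag zero.

```python
import cmath
import math


def periodogram(x, w):
    """Squared modulus of the finite Fourier sum at frequency w, over n."""
    n = len(x)
    s = sum(xt * cmath.exp(-1j * w * t) for t, xt in enumerate(x))
    return abs(s) ** 2 / n


def serial_covariance(x, h):
    """Serial covariance of lag h with divisor n (series assumed mean zero)."""
    n = len(x)
    return sum(x[t] * x[t + h] for t in range(n - h)) / n


def periodogram_from_covariances(x, w):
    """Cosine transform of the serial covariances of lags 0, 1, ..., n - 1."""
    n = len(x)
    return serial_covariance(x, 0) + 2 * sum(
        serial_covariance(x, h) * math.cos(w * h) for h in range(1, n))


series = [1.0, -2.0, 0.5, 3.0, -1.5, -1.0]   # hypothetical data, mean zero
for w in (0.3, 1.0, 2.5):
    print(w, periodogram(series, w), periodogram_from_covariances(series, w))
```

With these conventions the two columns agree at every frequency, apart from rounding, so that either function determines the other.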

These criticisms, which could be extended to cover minor errors and misprints,
are not intended to detract seriously from what is a remarkable achievement.
An excellent sense of proportion has been maintained throughout between
mathematical theory and illustrative discussion and examples. This makes
this treatise, if both the breadth and level of the subject matter are taken into
account, at present unique. It will be an indispensable reference book to every
teacher and advanced student of the theory of statistics.


Sequential Analysis of Statistical Data: Applications. Prepared by the
Statistical Research Group, Columbia University for the Applied Mathematics
Panel, National Defense Research Committee, Office of Scientific Research
and Development. SRG Report 265, Revised; AMP Report 30.2R,
Revised. New York: Columbia University Press, September 1945. pp. vii,
17; iv, 80; v, 57; iii, 25; iii, 18; iii, 39; ii, 41. $6.25. (London: Oxford University
Press, 1946.)

Reviewed by John W. Tukey 
Princeton University 

Many of the features of this compendium are familiar to most of the readers of
this review, but for the benefit of the others I shall enumerate them briefly. It
consists of a heavy looseleaf binder containing 7 booklets of distinctive colors,
each saddle stitched and usable separately. It is the last word (to date) in
presenting sequential analysis to the statistician who may wish to use it in practice.
It covers five elementary cases (each in a booklet, the two others being used for
introduction and appendices):

Acceptance or rejection by percent defective (Sec. 2)

Comparative percent satisfactory (Double dichotomy) (Sec. 3)

Acceptance or rejection by the adequacy of the mean (with known variability) (Sec. 4)

Acceptance or rejection by the exact value of the mean (with known variability) (Sec. 5)

Acceptance or rejection by the smallness of the variability (Sec. 6)

These cases are covered in complete detail, with illustrative examples, tables and
charts. A copy should be accessible to every teacher of statistics and to every
statistician in industry or experimental work who can propose new techniques of
testing.

With this general introduction let us go on and explain what the reader will
not find and what further work in this line the reviewer awaits with keen interest.
The classical testing procedure was to test a sample of predetermined size and
then decide to accept or reject. Long ago curtailed sampling and double sampling
were developed to cut corners legitimately and reduce inspection costs.
There are two situations, each more frequent in war than in peace, where it is
clearly desirable to reduce the average number of items tested to a minimum:

(I) Where essentially all lots are accepted and the test is destructive so that the
items tested are the main loss of production, or
(II) Where the cost of testing an item is large in comparison with the cost of
production.

Subject to a practically unimportant allowance for the finite size of the lots, and
to an allowance of unknown importance for the quality of lots presented, the
methods of sequential analysis minimize this average number among all methods
so far considered. When situation (I) or (II) holds without modifying complications,
then, the best known method is sequential analysis, the natural descendant
of double sampling. Otherwise, the situation is far from clear, and much judgement
is involved in setting up a practically efficient scheme. The reader will
get no help on this problem of judgement, nor in the problem of setting risks from
the book under review; he will get every needed help with the mathematical
problem of setting up a sequential plan to meet chosen risks, including complete
tables of all necessary functions, including natural logarithms.
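
For readers who have not seen a sequential plan, the following Python sketch shows the kind of rule involved in the first case (acceptance or rejection by percent defective). It is a minimal illustration of Wald's sequential probability ratio test, not a reproduction of the report's tables or charts; the function name, the acceptable and rejectable fractions defective p0 and p1, and the risks alpha and beta are all assumptions chosen for the example.

```python
import math


def sprt_percent_defective(items, p0, p1, alpha, beta):
    """Wald's sequential probability ratio test for the fraction defective.

    items : iterable of booleans, True meaning a defective item
    p0    : acceptable fraction defective (assumed, with p0 < p1)
    p1    : rejectable fraction defective
    alpha : risk of rejecting a lot of quality p0
    beta  : risk of accepting a lot of quality p1
    Returns (decision, number of items inspected).
    """
    upper = math.log((1 - beta) / alpha)        # reject at or above this
    lower = math.log(beta / (1 - alpha))        # accept at or below this
    step_defective = math.log(p1 / p0)
    step_good = math.log((1 - p1) / (1 - p0))

    log_ratio = 0.0
    n = 0
    for n, defective in enumerate(items, start=1):
        log_ratio += step_defective if defective else step_good
        if log_ratio >= upper:
            return "reject", n
        if log_ratio <= lower:
            return "accept", n
    return "continue", n                        # items ran out undecided


# Hypothetical plan: p0 = 5%, p1 = 20%, risks 5% and 10%.
print(sprt_percent_defective([False] * 50, 0.05, 0.20, 0.05, 0.10))
print(sprt_percent_defective([True] * 50, 0.05, 0.20, 0.05, 0.10))
```

The number of items inspected is not fixed in advance; it depends on how quickly the accumulated evidence crosses one of the two limits, which is the source of the saving in average sample size discussed above.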
There is no reason to suppose that sequential analysis is the last word in testing
procedures for the general problem of efficient testing, but what should be the
next step ahead is not a step for the mathematical statistician. What is needed
now is a careful analysis, by the operational research techniques so useful during
the war, of a half-dozen industrial testing situations to determine what properties
of the testing procedure are involved in cost and to what extent. Do we want
the minimum average sample size, the minimum average square of the sample
size, or what? With this there should go a corresponding operational study of
the advantages of different OC curves, including those of what now seems to be
a peculiar shape. Given these studies, we could put the problem in mathematical
statistics to the mathematical statistician which he would then solve. But with
the present lack of operational research groups in industry, it is probable that
we will proceed in an unnatural way, and that the mathematical statistician will
take the next step forward. For reasons of mathematical simplicity it is not
unlikely that the sample plan with the minimum average squared sample size
will come next.

The credit for the book is clearly assigned on the inside cover of each pamphlet
in the following words: “So many members of the Statistical Research Group
(Columbia) have participated in the preparation of this report, a previous edition
of which was prepared by H. A. Freeman, that its authorship is attributed
to the group as a whole. The responsibility for planning and preparing this
edition has been shared by H. A. Freeman, M. A. Girshick, and W. Allen Wallis,
with the cooperation of Kenneth J. Arnold, Milton Friedman, Edward Paulson,
and others. The theory of sequential analysis is mainly the work of A. Wald.”

It may be of interest to notice a few minor points for the record. On page
1.01 it is indicated that 100% inspection is 100% effective; this seems far from
industrial experience. Another badly needed set of operational studies would be
on the influence of the sampling plan on inspector's inspection. On page 2.27,
the footnote suggests that when a tabular procedure is used instead of
a graphic one, more decimal places should be kept; the logic of this is not
clear. On page 4.14 it is stated that “similarly, if all patches had tested 400
minutes, the experiment would have terminated at 9.4 . . .”. Clearly no such
experiment can terminate after a fractional number of tests. On page A.09
it is stated that “Finally it should be mentioned that truncation of any kind
ought generally to be avoided”. This seems to the reviewer to be a rash statement,
for when not only average sample size but all other properties entering into
the practical efficiency of a sampling plan are considered, this decision will almost
certainly be reversed. The relatively small number of these detailed points is
an evidence of careful and competent workmanship.

A footnote to the Appendix (B) on some principles of sequential analysis states:
“Any mathematician who may stray into this Appendix should be assured that
the validity of the conclusions in no case depends upon the type of reasoning
presented here; indeed, even for intuitive or heuristic arguments mathematicians
may prefer those given in SRG 75”. This warning and caveat seems unduly
strong; the appendix is recommended to all mathematically minded newcomers
to sequential analysis.

The same appendix warns the reader in a few places that the theory set forth
does not allow for the fact that samples come in units. If the reader tries to
apply the theory to cases far from normal inspection practice, for example with
risks of 0.25 and average sample sizes of 12, he will then find out that this does
occasionally make a difference. In conventional circumstances the approximation
will not bother him.



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items

Mr. Kurt W. Back has accepted a position with the Research Center for Group
Dynamics, Massachusetts Institute of Technology.

Mr. Stanley D. Canter was discharged from the Army in October and has been
enrolled as a graduate student in mathematical statistics at Columbia University.

Mr. William W. Cooper has accepted a position at Carnegie Institute of
Technology, Pittsburgh.

Mr. Robert Dorfman is enrolled as a graduate student in the Department of
Economics, University of California, Berkeley, and is also serving as a teaching
assistant in that department.

Dr. Nicholas Fattu, formerly at Michigan State College, has accepted a teaching
position at Indiana University, Bloomington.

Mr. John P. Gill is now Chief of the Research and Progress Analysis Division,
War Assets Administration, Houston Regional Office, Texas.

Dr. Clausin D. Hadley has accepted a position with the Graduate School of
Business, Stanford University.

Mr. Malcolm H. Henry is now Assistant Statistician in the Statistical Department
of the Michigan State Department of Social Welfare, Lansing.

Dr. Alston S. Householder has accepted a position as Principal Physicist with
the Monsanto Chemical Company, Clinton Laboratories, Oak Ridge, Tennessee.

Mr. Morton Kramer is now with the Office of International Health Relations,
U. S. Public Health Service, Washington.

Mr. E. C. Leone, who was discharged from the military service in the fall, has
returned to his former position in the Department of Mathematics at Purdue
University, Lafayette, Indiana.

Mr. Philip J. McCarthy, formerly at Princeton University, is now at Cornell
University, Ithaca, New York.

Mr. Edward C. Molina has been named special lecturer in Mathematics at
Newark College of Engineering, in addition to Dr. Emil J. Gumbel, previously
mentioned.

Mr. Nicholas Pastore has accepted a position in the Department of Mathematics,
City College of New York.

Dr. William S. Robinson is now Assistant Professor of Sociology and Statistics,
University of California at Los Angeles.

Dr. Leonard J. Savage, who has a Special Rockefeller Fellowship, is spending
the academic year at the Institute of Radiobiology and Biophysics, University of
Chicago.

Professor Dunham Jackson died at Minneapolis on November 6, 1946. From
1919 until 1946 Mr. Jackson was Professor of Mathematics at the University of
Minnesota, and in 1946 was named Professor Emeritus.

Professor Charles C. Wagner died suddenly on May 23, 1946, at the age of 62. 
He was acting dean of the College of Liberal Arts of Pennsylvania State College 
when he died. 

Those interested in the work of the Mathematical Tables Project, will, upon 
request, be placed on the mailing list for copies of the monthly progress reports, 
issued by the Project. Requests should be addressed to Dr. Arnold N. Lowan, 
150 Nassau St., New York, N. Y. 


Statistical Research Laboratory, University of Michigan 

Several developments in instruction and research in the general field of sta- 
tistics are in progress at the University of Michigan. 

At the beginning of the current academic year the new Statistical Research 
Laboratory was opened. It is planned that this unit, which is a division of the 
Graduate School, will serve as the center for research employing statistical me-
thods and for research in statistical methodology. Free consultation and advice 
on statistical matters are offered to all members of the University engaged in 
research and the latest types of computing machines are available for their use at 
no cost to them. Or the Laboratory will undertake, at fees to cover costs, com- 
puting and the analysis of data for such individuals or units of the University. 
The Laboratory will have available the services of the University’s completely 
equipped Sorting and Tabulating Station and expects to continue to provide a 
center for the most efficient computing service as improved machines are de- 
veloped. The technical assistants employed by the Laboratory will be advanced 
students of statistics who will thus have the opportunity to supplement their 
training with experience with actual statistical investigations. Professor 
C. C. Craig as Director and Professor P. S. Dwyer are in charge of the new labora-
tory, each on a half-time basis. 

The new Laboratory is a research and not a teaching unit and is distinct from 
the large statistical laboratories for the use of students in statistics courses already 
in existence on the campus. With respect to instruction in theoretical statistics, 
the curriculum in that subject in the Mathematics Department has recently been 
revised and extended to include twenty-four semester hours at the undergraduate 
and graduate levels in addition to courses in probability, finite differences, 
graphical methods, and quality control. The somewhat related professional 
program in actuarial mathematics has likewise been strengthened. The teaching 
staff for these two curricula includes Professors H. C. Carver, A. H. Copeland, 
C. C. Craig, P. S. Dwyer, C. H. Fischer, and C. J. Nesbitt. 

A number of postwar research programs whose pursuit involves the use of 
probability and statistical methods have been established at the University of 
Michigan. Of especial interest is the new Survey Research Center under the 
leadership of Professors R. Likert and A. A. Campbell who will continue activities 
begun by their group in Washington in the Department of Agriculture. Re-
search by survey methods in the social sciences for public and private agencies 
and in survey methods themselves will be pursued and in addition a training 
program combining formal courses and apprenticeship in the Center is being 
set up. 


New Members 

The following persons have been elected to membership in the Institute:

Albert, George E., Ph.D. (Wisconsin) Head, Mathematics Division, Research Dept., Naval 
Ordnance Plant, Indianapolis, Ind., 1104 N. Oakland Ave.

Ament, Richard P., B.A. (Cornell) Scientific Aid, S1S9 SOlh St., N., Arlington, Va.

Bennett, Myra S. (Mrs. C. A.), A.B. (Michigan) Office Mgr., Institute of Math. Stat., 
Rackham Bldg., Ann Arbor, Mich., P. O. Box, Saline. 

Blankmeyer, Edith, A.B. (Western College) Stat., Res. Dept., National Broadcasting Co., 
30 Rockefeller Plaza, New York 20, N. Y. 

Blyth, Colin, Jr., M.A. (Queen’s Univ. and Univ. of Toronto) Graduate student, Univ. of 
N. Car., Chapel Hill, N. Car., S09 Mangum Dormitory.

Brown, Philip, B.S. (Pittsburgh) Stat., R 320 Standard Oil Bldg., 3rd and Constitution 
Aves., Washington, D. C. 

Bruno, O. P., B M H (New York Univ.) Chief, Methods Section, Ballistic Research Labs., 
Aberdeen Proving Ground, Md.

Carrier, Norman H., M.A. (Cantab) Civil Servant, Mathematical Statistics Section of 
Chief Scientific Adviser’s Division, Ministry of Works, c/o Westminster Bank, Palmers 
Green, N. 13, London, England. 

Chand, Uttam, M.A. (Punjab Univ., India) Graduate student, Univ. of N. Car., Chapel 
Hill, N. Car., US Mangum Dormitory. 

Crow, Edwin L., Ph.D. (Wisconsin) Mathematician, Science Dept., Res., Devel., and Test 
Organization, USNOTS, Inyokern, Calif.

Dang, Mary, M.A. (California) Graduate student, Columbia University, New York 27, 
N. Y., Box ^67, Johnson Hall. 

Ens, Catherine C., B.S. (Dayton) Stat. Res. Ass’t, Graduate School, Ohio State Univer-
sity, Columbus, Ohio, S67 Fifteenth Ave., Columbus 10. 

Fox, William H., Ph.D. (Indiana) Ass’t Prof. of Educ. and Ass’t Director of Res. and 
Field Service, Indiana Univ., Bloomington, Ind., 7SSi E. Hunter. 

Geisler, Murray A., M.A. (Columbia) Operations Analyst, Headquarters Army Air 
Forces, SSS N. Piedmont St., Arlington, Va. 

Gershenson, Charles P., B.B.A. (C.C.N.Y.) Res. Assoc., Institute of Psychological Res., 
Box 130, Teachers College, New York 27, N. Y. 

Gilford, Leon, A.B. (Brooklyn) Econ. Analyst, Census Bureau, Washington, D. C., 1410 
mh St., S. E. 

Goudsmit, S. A., Ph.D. (Leyden) Prof. of Physics, Northwestern Univ., Evanston, Ill. 

Halperin, Max, M.S. (Iowa) Graduate student, Univ. of N. Car., Chapel Hill, N. Car., 211 
No. Columbia.

Halperin, Sidney L., Ph.D. (Ohio State) Psychologist, Neuropsychiatric Institute, Univ. of 
Mich. Hospital, Ann Arbor, Mich., 2401 Pittsfield Blvd., Pittsfield Village. 

Herbach, Leon H., A.B. (Brooklyn) Sub. Instr., Dept. of Math., Brooklyn Coll., N. Y., 
1926 84th St., Brooklyn 4.

Hoeffding, Wassily, Ph.D. (Berlin) 161 West 88 St., New York 24, N. Y.

Huhndorff, Roland F., B.S. (St. Mary’s Univ.) Ass’t to Ass’t Chief Chemist, The Texas 
Co., Res. Lab., Port Arthur, Texas. 

James, William C., A.B. (Knox Coll.) Director, Stat. Div., National Safety Council, 20 
N. Wacker Dr., Chicago 6, Ill., 7SS5 So. Dobson Ave., Chicago.

Lev, Joseph, Ph.D. (Cornell) Ass't Civil Service Examiner, N. Y. C. Civil Service 
Comm., and Lecturer, Teachers College, Columbia Univ., N. Y., S6eo Forest Parkway, 
Woodhaven SI. 

Linder, Arthur, Ph.D. (Bern) Prof. of applied math. stat., University of Geneva, Switzer-
land, Avenue de Champel.

Lord, Frederic M., M.A. (Minnesota) Ass’t Director, Graduate Record Examination, 437 
West 59th St., New York 19, N. Y., IBS W. 8Srd St. 

Marshall, Herbert, B.A. (Toronto) Dominion Stat., Dominion Bureau of Statistics, 
Ottawa, Canada.

Meacham, Alan D., Supv., Sorting and Tabulating Station, and Lecturer, School of Bus. 
Adm., Univ. of Mich., Ann Arbor, Mich., tl4 Rackham Bldg.

Miller, Irving, B.S. (C.C.N.Y.) Stat., Bureau of Labor Stat., Washington, D. C., 1900 
Biltmore St., N. W., Washington S.

Nanda, D. N., M.A. (Agra, India) Graduate student, Univ. of N. Car., Chapel Hill, N. 
Car., Dept. of Statistics. 

Pines, Sylvia F., M.A. (Michigan) Instr., Math. and Stat., 43-17 48th St., Long Island 
City 4, N. Y.

Quastler, Henry, M.D. (Vienna) Medical Radiologist, Carle Hospital Clinic, Urbana, Ill., 
eW W. Nevada.

Reiersol, Olav, Ph.D. (Stockholm) Teacher of stat., Univ. of Oslo, Oslo, Norway, Interna-
tional House, 500 Riverside Dr., New York 27, N. Y. 

Romanovsky, Vsevolod I., Ph.D. (Moscow) Prof. at the Univ. and Member of the Academy 
of Sciences, Tashkend, U. S. S. R.

Rust, Charles H., S.J., M.A. (St. Louis) Graduate student, St. Louis Univ., St. Louis, 
Mo., W N. Grand Blvd., St. Louis S. 

Seal, Hilary L., B.Sc. (Univ. Coll., London) Head of Stat. Branch, Room 2, Old Bldg., G., 
Admiralty, Whitehall, London, S. W. 1, England. 

Serbein, Oscar N., Jr., M.S. (Iowa) Graduate student, Columbia Univ., New York 27, 
N. Y., Army Hall, Rm. SS3H, 1560 Amsterdam Ave., New York SI. 

Shell, D. A., B.Sc. (London) Stat. in Math. Stat. Section of Chief Scientific Adviser’s 
Div., Ministry of Works, 81 Lynmouth Ave., Bush Hill Park, Enfield, Middlesex, 
England. 

Siegel, Irving H., M.A. (New York) Chief, Economics Div., Veterans Adm., Washington, 
D. C., 5407 9th St., N. W., Washington 11. 

Sitgreaves, Rosedith, M.A. (Geo. Washington) Ass't Stat., U. S. Public Health Service 
(on leave), Graduate student, Columbia Univ., New York 27, N. Y., Johnson Hall, 
411 W. 116th St. 

Tama, Joseph, B.A. (Washington) Pfc., U. S. Army, 5260 TIC; GHQ AFPAC; APO 600, 
c/o Postmaster, San Francisco, Calif.

Tate, Merle W., Ed.M. (Harvard), M.A. (Montana) Assoc. Prof. of Educ., Hamilton 
Coll., Clinton, N. Y.

Thrall, Robert M., Ph.D. (Illinois) Ass’t Prof. of Math., Univ. of Mich., Ann Arbor, 
Mich., 953 Spring St.

Vaughn, Kenneth W., Ph.D. (Iowa) Director, Graduate Record Examination Office of the 
Carnegie Foundation for the Advancement of Teaching, and Assoc. Director of Co-
operative Test Service of Amer. Council on Educ., 437 West 59th St., New York 19, N. Y. 

Wallace, Clifford A., Sup’t of Quality, Camera Works, Eastman Kodak Co., 333 State St., 
Rochester, N. Y.

Wilkins, J. Ernest, Jr., Ph.D. (Chicago) Mathematician, American Optical Co., S. I. D., 
Box A, Buffalo 15, N. Y. 

Wilkinson, Roger I., B.S.E.E. (Iowa State) Member Technical Staff, Bell Telephone Labs., 
463 West St., New York, N. Y.



REPORT ON THE BOSTON MEETING OF THE INSTITUTE 


The twenty-fourth meeting of the Institute of Mathematical Statistics was held 
at the Hotel Statler, Boston, Massachusetts, on Saturday, December 28, 1946. 
The meeting was held in conjunction with the One Hundred Thirteenth Annual 
Meeting of the American Association for the Advancement of Science. The 
following 45 members of the Institute attended the meeting: 

K. J. Arnold, M. S. Bartlett, W. D. Baten, C. I. Bliss, G. W. Brier, G. W. Brown, T. H. 
Brown, B. H. Camp, C. W. Churchman, W. G. Cochran, J. H. Curtiss, D. B. DeLury, P. V. 
Dorweiler, Churchill Eisenhart, Benjamin Epstein, H. A. Freeman, Hilda Geiringer, H. H. 
Germond, J. A. Greenwood, Boyd Harshbarger, W. A. Hendricks, E. H. C. Hildebrandt, 
W. C. Jacob, H. B. Kaitz, L. F. Knudsen, Walter Leighton, A. J. Lotka, J. W. Mauchly, 
Margaret Merrell, E. B. Mode, Frederick Mosteller, C. M. Mottley, Doris Newman, R. H. 
Noel, H. W. Norton, Otis Pope, C. J. Rees, C. F. Roos, P. J. Rulon, J. W. Tukey, W. M. 
Upholt, F. M. Wadley, C. L. Weaver, C. P. Winsor, W. J. Youden. 

At the morning session, a joint session with the Biometrics Section of the 
American Statistical Association, the following program was presented with 
Professor E. B. Wilson of Harvard University as chairman:

Topic: The Analysis of Variance in Biology

Papers: The Assumptions Underlying the Analysis of Variance
            Professor Churchill Eisenhart, University of Wisconsin and The National
            Bureau of Standards
        Some Consequences when the Assumptions are not Satisfied
            Professor W. G. Cochran, North Carolina State College
        The Use of Transformations
            Professor M. S. Bartlett, Cambridge University and the University of
            North Carolina

Discussion: Professor Boyd Harshbarger, Virginia Polytechnic Institute
            Dr. W. C. Jacob, Long Island Vegetable Research Farm
            Professor C. P. Winsor, Johns Hopkins University
            Dr. W. J. Youden, Boyce Thompson Institute

The program for the afternoon session, also a joint session with the Biometrics 
Section, under the chairmanship of Dr. E. J. DeBeer, Wellcome Research 
Laboratories, was as follows: 

Topic: The Analysis of Variance in Biology (continued)

Papers: The Analysis of Covariance
            Professor D. B. DeLury, Virginia Polytechnic Institute
        Discriminant Functions
            Professor George W. Brown, Iowa State College

Discussion: Professor W. D. Baten, Michigan State College
            Professor C. I. Bliss, Yale University
            Mr. W. A. Hendricks, U. S. Department of Agriculture

P. S. Dwyer, 
Secretary. 



ANNUAL REPORT OF THE PRESIDENT OF THE INSTITUTE FOR 1946 

New Opportunities

The return to peacetime conditions presents the Institute with new oppor-
tunities for expanding its activities and usefulness. An increased appreciation 
for mathematical statistics has followed the many contributions made by our 
members to the war effort. The numerous societies interested in specific appli- 
cations of statistics have come to look to the Institute both for leadership in 
theory and for playing its part in the dissemination of new results. As a result of 
the drastic interruption in the normal training of students during the war, there 
is unusually keen competition for the services of capable statisticians. Those of 
our members who are engaged in teaching are responsible for the execution of a 
vigorous training program to meet current and future demands promptly and 
without sacrifice of quality. In short, we are in a position, as never before, to 
advance the development and efficient use of mathematical statistics. The fol- 
lowing account of some of our activities during the year will indicate, I believe, 
that the record is creditable. Yet in many instances what has been accomplished 
is only a beginning. 


Meetings 

The Development Committee has repeatedly stressed the desirability of an 
extension in our customary schedule of meetings in order to provide additional 
contacts between mathematical statisticians and the users of statistics. Owing 
to the greater availability of railway and hotel accommodation in 1946, we ob-
tained our first opportunity to put this extension into effect. The regular winter 
meeting with the American Statistical Association and other social science or-
ganizations was resumed at Cleveland in January, while the late summer meeting 
with the mathematicians took place at Cornell in September. In addition, two 
meetings were held with different sections of the American Association for the 
Advancement of Science, at St. Louis in March and at Boston in December. 
On both occasions the programs were expository and attracted large audiences. 
Finally, at the invitation of Princeton University, a one-day meeting at Princeton 
in November was devoted to the analysis of variance. While no joint sessions 
were conducted with engineering or industrial societies, several of our members 
took prominent parts in the programs of such societies. 

For the near future, it seems desirable to continue the practice of meeting in 
the winter with the ASA and social science groups and in the summer with the 
mathematical groups. In 1947 these meetings will be at Atlantic City, January 
2^27 and at Yale, September 1-5 respectively. It is not known whether con-
ditions in future years will produce a return to Christmas rather than January 
meetings; for the present the hotel situation swings the balance in favor of 
January. 

In 1946 the membership of the program committee was enlarged so that it 
would be better equipped to arrange joint meetings with other societies. We 
owe our thanks to the members for their successful efforts in the face of difficulties 
which still attend the planning of a meeting. 

Annals 

Despite the scarcity of manuscripts in the later stages of the war, our editor, 
Professor S. S. Wilks, succeeded throughout in maintaining the annual volumes 
of the Annals at their usual size. During 1946, scarcity gave way to plenty. 
The number of papers of good quality submitted in recent months is sufficiently 
great that there will be more than enough, by current estimates, to fill the 1947 
volume. To narrow the scope of the Annals or to reject good papers would be 
undesirable. Accordingly, the Directors have authorized an increase of 100 
pages in the 1947 volume if this is necessary to insure the publication of all ac- 
ceptable papers. 

A gratifying testimony to the prominence of the Annals in its field is the marked 
increase in the demand for back numbers. Our Secretary-Treasurer reports 
that sales amounted to $3,235. To meet actual or anticipated orders, eleven 
issues were reprinted during 1946 at a cost of $2,809. 

For most members of the Institute, even those who serve on the Board, work 
on Institute affairs occupies only a minor portion of our time. The editor is 
never free from some forthcoming publication deadline. Initial perusal of manu-
scripts, selection of referees, editorial decisions, handling of the production phases 
of publication and much miscellaneous correspondence (not all of it pleasant) 
make editorial work a daily preoccupation, year in and year out. An annual 
word of thanks is an inadequate expression of our indebtedness to Professor 
Wilks. 


Membership and Finance 

At the beginning of 1945 there were 606 members. A year later this figure 
had increased to 777 and at the end of 1946 the figure stood at 900. A fifty per- 
cent increase in two years is another evidence of the healthy growth of the Insti- 
tute. It has been attained to a considerable extent through the hard work of 
our Secretary-Treasurer, P. S. Dwyer and the cooperation of individual members. 

The Secretary-Treasurer also reports a very satisfactory net gain in assets 
of $2,627 during the year. Nevertheless, financial problems may arise in the 
near future. Printing and other costs have risen sharply, and the printing of an 
enlarged Annals will be an additional drain on our resources. Both the Member- 
ship and Development Committees have given some thought to the need for 
additional revenue that may face us soon. They have recommended considera- 
tion of the possibility of Institutional Memberships, a device that has been found 
satisfactory by some other societies. A continued growth in membership will 
also help greatly to finance expanded activities. 


Committees 

Inter-society affairs. The report of the 1944 Committee on Development, stress-
ing the need for closer cooperation amongst the various societies interested in 
statistics, provided the stimulus for active efforts in this direction. A meeting 
of representatives of these societies was called early in 1945 at the invitation of 
the American Statistical Association. This meeting suggested that a reconsti-
tution of the ASA might enable it to become the central binding organization. 
Accordingly, a committee of the ASA has worked for a considerable time on a 
revision of the ASA constitution, which it is intended to submit to the votes of 
ASA members early in 1947. The new constitution provides for representation 
from other societies on the Council of the ASA, should these societies decide to 
associate or affiliate with the ASA. 

From our own point of view, it has seemed wise to delay action on certain 
internal affairs while awaiting the outcome of these developments in the ASA. 
Thus a statement of policy with regard to the formation of chapters of the IMS 
is needed and the problem has been considered both by a special committee in 
1946 and by the Development Committee in 1946. The latter committee recom-
mends that no decision be made pending examination of the provisions for joint 
sponsorship of local and regional chapters in the new ASA constitution. Simi-
larly, our own Committee on Revising the Constitution and By-Laws has de-
ferred a final report until the attitude of our members towards the new develop-
ments can be expressed. It is to be hoped that decisions can be taken in 1947. 

Tabulation: The advances made in recent years in the construction of new types 
of computing equipment justified an enlargement of our Committee on Tabula-
tion, which now includes experts both on the building of machines and on the 
calculation and use of tables. The committee plans to keep our members in-
formed of progress in this field. 

Government Service: Dr. W. Edwards Deming served as chairman of a new com- 
mittee on Mathematical Statistics and Statisticians in the Government Service. 
Although the federal government employs many mathematical statisticians, 
explicit recognition of the profession is lacking in many instances. As has hap-
pened in other fields, statisticians are sometimes officially classed as economists 
and little provision is made for mathematical statisticians in recruitment policies. 
Moreover, it is probable that a number of branches of the government, at present 
unaware of the functions of a statistician, could employ several with profit. 
The new committee will endeavor to insure that mathematical statistics is recog-
nized and effectively utilized in the federal service. 

Assistance to libraries: Like other professional societies, the Institute has re-
ceived a number of appeals from libraries in war areas whose periodicals were 
looted or destroyed during the war. After careful consideration, the Board 
decided that official action should be limited to the free provision of missing 
copies of the Annals to all former subscribers who intend to renew subscriptions 
for the future. In addition, a committee with Professor J. Neyman as chairman 
was appointed to establish a procedure by which gifts of individual members 
(books, reprints, back numbers of the Annals or cash for the purchase of back 
numbers) could be handled. At the suggestion of this committee a general 
appeal for the small sum of 50 cents per member was circulated with the Decem- 
ber billing. Individual collections are also being made at certain centers. 

Teaching: The Committee on Teaching has not made as much progress as it 
would have liked, owing to the dispersal of its members and the taking up of new 
civilian posts. Members have, however, cooperated with the Committee on 
Applied Mathematical Statistics of the National Research Council, which is 
engaged on a somewhat similar survey.

Rietz lecture: The first lecturer in the new series of lectures in honor of the late 
Henry Lewis Rietz will be Professor A. Wald. His topic will be “Sequential 
Estimation and Multi-Decisions”. The lecture will be delivered in connection 
with the Yale meetings, September 1947. 

Representatives; In addition to its committee work, the Institute cooperates, 
through representatives, with the Division of Physical Sciences of the National 
Research Council, the Joint Committee for the Development of Statistical Ap-
plications in Engineering and Manufacturing, the American Association for the 
Advancement of Science, the Inter-Society Committee on Federation and the 
Policy Committee for Mathematics. The last committee, which was appointed 
in 1946, will consider important problems that affect the mathematics profession 
as a whole. 

Nominations: The Committee on Nominations, consisting of Professor P. R. 
Rider (chairman), Professor B. H. Camp and Professor G. M. Cox, has made the 
following nominations for officers in 1947:

President               W. Feller
Vice-Presidents         J. H. Curtiss
                        M. H. Hansen
Secretary-Treasurer     P. S. Dwyer

While it is perhaps improper to comment on nominations, I should like to 
express my personal appreciation of Professor Dwyer’s action in being willing 
to offer himself for re-nomination as Secretary-Treasurer. The successful opera-
tion of the Institute rests mainly on the Secretary-Treasurer, and the demands 
of the Office are even more continuous and exacting than those on the editor. 
Professor Dwyer’s splendid work during his first three years of office, carried on 
at considerable sacrifice of his research interests, deserves the best thanks and 
appreciation of every member. 

In conclusion, it is a pleasure to express my sincerest thanks to all committee 
chairmen and members and to all representatives for their excellent work for the 
good of the Institute, and to all Institute members for their loyal support. 

W. G. Cochran, 
President, 1946. 


Committees of the Institute 

Committee                           Personnel

Development                         E. G. Olds (chairman), C. I. Bliss, M. A.
                                    Girshick, F. C. Mosteller, P. S. Olmstead,
                                    H. Scheffé

Membership                          W. Feller (chairman), C. C. Craig, P. A.
                                    Horst, T. Koopmans

Program                             J. H. Curtiss (chairman), M. Friedman, B.
                                    Harshbarger, W. N. Hurwitz, A. M. Mood,
                                    F. C. Mosteller, J. W. Tukey

Mathematical Statistics and         W. E. Deming (chairman)
  Statisticians in the Govern-
  ment Service

Revising the Constitution           M. H. Hansen (chairman), C. I. Bliss, A. T.
  and By-Laws                       Craig, J. H. Curtiss, W. Shewhart

Tabulation                          C. Eisenhart (chairman), P. S. Dwyer, H.
                                    Goldstine, A. N. Lowan, H. W. Norton, G. R.
                                    Stibitz

Teaching                            H. Hotelling (chairman), W. Bartky, W. E.
                                    Deming, M. Friedman

Nominations                         P. R. Rider (chairman), B. H. Camp, G. M.
                                    Cox

Finance                             P. S. Dwyer (chairman), L. A. Knowler, C. F.
                                    Roos, F. F. Stephan

Subscription to Purchase An-        J. Neyman (chairman), W. Feller, P. L. Hsu
  nals for Countries Devas-
  tated by War

Society Representatives

Inter-Society Committee on          J. H. Curtiss, P. S. Olmstead
  Federation

Policy Committee for Mathe-         W. Feller
  matics

Joint Committee for the De-         F. C. Mosteller, S. S. Wilks
  velopment of Statistical
  Applications in Engineer-
  ing and Manufacturing

American Association for the        G. W. Snedecor
  Advancement of Science

Division of Physical Sciences,      W. Bartky
  NRC



REPORT OF THE SECRETARY-TREASURER OF 
THE INSTITUTE FOR 1946 

The Institute of Mathematical Statistics held five meetings during 1946, at 
Cleveland on January 24-27, at St. Louis on March 30, at Ithaca on August 
22-23, at Princeton on November 1, and at Boston on December 28. 

The large number of meetings has necessitated frequent mailings to the 
membership. Memoranda to members, with appropriate enclosures, were sent 
out in January, March, June, July, October, and November. 

The Secretary-Treasurer wishes to acknowledge the cooperation of the mem- 
bers of the Institute in paying bills promptly, in considerable activity leading to 
an increase in membership, and in general looking after the interests of the In- 
stitute. 

At the beginning of 1946 the Institute had 777 members. During the year 
180 new members joined the Institute, an increase of 23%. However, during 
1946 the Institute lost 57 members. Of these, 15 resigned, 37 were dropped for 
non-payment of dues, and 5 are deceased. Some of the 37 dropped we have 
been unable to contact, and it is very probable that, in some cases, membership 
will be resumed in the future. The net increase in members during the year was 
123, or about 16%, making a total of 900 members. 
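
The membership figures quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch in Python; the variable names are hypothetical and the inputs are just the counts stated in this report (777 members at the start of 1946, 180 joined, 15 resigned, 37 dropped, 5 deceased).

```python
# Illustrative check of the 1946 membership figures quoted in the report.
# All variable names are hypothetical; only the counts come from the text.
members_start = 777          # members at the beginning of 1946
joined = 180                 # new members during 1946
resigned, dropped, deceased = 15, 37, 5

lost = resigned + dropped + deceased       # members lost during the year
net_increase = joined - lost               # net change in membership
members_end = members_start + net_increase

# Percentages relative to the membership at the start of the year
gross_pct = round(100 * joined / members_start)
net_pct = round(100 * net_increase / members_start)

print(lost, net_increase, members_end, gross_pct, net_pct)
```

Under these inputs the losses, net increase, year-end total, and rounded percentages agree with the figures given in the paragraph above.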

The following members died during the year: 

Professor O. P. Banos 
Professor S. A. Cudmore 
Professor Dunham Jackson 
Dr. Walter P. Schilling 
Professor C. C. Wagner 

The office of the Secretary-Treasurer sent a reprint of an Annals article and 
information about the Institute to 1800 persons interested in Quality Control. 
At least 28 of the new members became members as a result of this drive. As a 
continuation of a campaign started in 1946, the Institute also sent literature 
about the Annals to several hundred libraries and laboratories.

The Secretary-Treasurer wishes to acknowledge the continued assistance of 
Professor Lloyd Knowler in caring for the back issues of the Annals which are 
stored at Iowa City. 

A few comments about the financial statement which appears below are in 
order. In addition to the increase in membership, mentioned above, the chief 
rise in income resulted from the unprecedented sales in back issues which 
amounted to $3,234.88, an increase over the preceding year (the previous high) 
of 86%. These heavy sales, however, depleted the supplies of many of our early 
issues, so that we were forced to reprint eleven of those issues and also the cumula- 
tive index during the year. This cost $2,809.00 (for 500 copies of each) and in-
dicates that a much larger portion of our assets is in inventory, as shown in Ex-
hibit D. 

Following the instructions of the Finance Committee, Professor H. C. Carver 
was paid for his share of all issues in which he and the Institute had joint owner-
ship. 

Nine members have paid life memberships during the year, increasing the 
total of life membership funds by $812.50. 

The net gain in assets of $2,627.23 is very satisfactory even though this gain 
is evident in increased inventory and not in a better cash position. 


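FINANCIAL STATEMENT 
December 31, 1945, to December 31, 1946 

A. Receipts 

Balance on Hand, December 31, 1945 ...............................  $7,548.22
Dues .............................................................   4,638.40
Life Membership Payments .........................................     812.50
Subscriptions ....................................................   2,057.54
Sales of Back Numbers ............................................   3,234.88
Income from Investments ..........................................     150.00
Miscellaneous ....................................................     121.29

Total ............................................................ $18,562.83

B. Expenditures 

Annals—Current 
    Office of Editor ...............................    $125.00
    Waverly Press ..................................   4,566.27
                                                                   $4,691.27
Annals—Back Numbers 
    Purchase from H. C. Carver .....................     644.50
    Reprinted 500 copies ...........................   2,809.00
        Vol. I #1, II #2, II #3, III #3, IV #1, VII #1, VII #2,
        VIII #1, 2, 3, 4, Cumulative Index 
    Iowa City Office ...............................      41.46
    Binding ........................................      68.00
                                                                    3,562.96
Office of President ..............................................      25.62
Mathematical Reviews .............................................     100.00
Office of the Secretary-Treasurer 
    Printing, Mimeographing, programs, etc. (including
        stamped envelopes) .........................    $967.14
    Printing 1800 copies of Wald-Wolfowitz article .     140.00
    Postage and supplies ...........................     375.00
    Clerical help ..................................   1,420.25
                                                                    2,902.39
Miscellaneous ....................................................      39.04
Balance on Hand, December 31, 1946 (Cash and Bonds) ..............   7,241.55

                                                                  $18,562.83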


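C. Summary of Receipts and Expenditures 

Balance on Hand,* December 31, 1945 ..............................  $7,548.22
Receipts during 1946 .............................................  11,014.61
Expenditures during 1946 .........................................  11,321.28
Balance on Hand,* December 31, 1946 ..............................   7,241.55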


D. Comparison of Assets on December 31, 1945 and December 31, 1946 

                                                        1945          1946
U. S. Government G Bonds ........................   $5,000.00     $5,000.00
Life Membership Funds: Bonds ....................      888.00      1,888.00
                       Bank Dep. ................      327.00        139.50
Additional Bank Deposits ........................    1,333.22        214.05
Current Accounts Receivable .....................      255.35        452.62
Estimated Value (Cost) of back issues of Annals .    4,497.95      7,234.58**

Total ...........................................  $12,301.52    $14,928.75

Net Gain 1946 ...................................                  2,627.23

E. Liabilities of Institute of Mathematical Statistics as of December 31, 1946 

All bills which have been presented have been paid and there are no outstanding accounts 
against the Institute. The $2,027.50 in Life Membership payments require the Institute to 
provide the privileges of membership for life for the 20 members who have made payments. 
Also, $2,686.71 should be credited to 1947 dues and subscriptions.


December 31, 1946 

Paul S. Dwyer 
Secretary-Treasurer. 


* In form of bank deposit and government bonds. 

** Value of Annals calculated at 67 cents per copy, and based on physical inventory.
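
The summary totals in exhibits C and D can be cross-checked as follows. This is a minimal sketch in Python, not part of the original report; the variable names are hypothetical, and the amounts are the balance, receipts, expenditures, and asset totals stated in the exhibits. Decimal arithmetic is used so that cents are handled exactly.

```python
# Illustrative cross-check of the exhibit C and D totals.
# Variable names are hypothetical; amounts are those stated in the exhibits.
from decimal import Decimal

opening_balance = Decimal("7548.22")    # balance on hand, December 31, 1945
receipts = Decimal("11014.61")          # receipts during 1946
expenditures = Decimal("11321.28")      # expenditures during 1946

total_available = opening_balance + receipts         # total of exhibit A
closing_balance = total_available - expenditures     # balance, December 31, 1946

assets_1945 = Decimal("12301.52")       # total assets, December 31, 1945
assets_1946 = Decimal("14928.75")       # total assets, December 31, 1946
net_gain = assets_1946 - assets_1945    # net gain in assets during 1946

print(total_available, closing_balance, net_gain)
```

With these inputs the computed total, closing balance, and net gain match the $18,562.83, $7,241.55, and $2,627.23 reported in the statement.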



ANNUAL REPORT OF THE EDITOR FOR 1946 

During 1946 there was a considerable increase in the number of manuscripts 
submitted to the Editorial Committee of the Annals. A total of 49 papers in-
cluding 18 short notes were published in the 1946 volume of the Annals. The 
publication of these papers together with various official reports of the Institute 
and the Directory of the Institute required a total of 555 pages. Plans are al- 
ready under way to expand the 1947 volume of the Annals to 600 pages. 

During recent years there has been a very noticeable broadening of interest
in the field of probability and statistical theory on the part of readers of and
contributors to the Annals. Contributors to the 1946 volume came from university
departments of astronomy, biology, mathematics, sociology and statistics; from
Army, Navy and other government groups; and from industrial laboratories and
quality control departments. More recently, contributions have been received
from physicists, chemists and other groups. More contributions are now being
received from overseas than in previous years. Every effort is being made to
keep the Annals balanced with respect to these various directions of interest in
probability and statistical problems. It is believed that one of the most effective
things which could be done for the readers of the Annals is to publish expository
articles from time to time on new fields of development in probability
and statistical theory. Invitations have been accepted by several individuals to
prepare such articles.

Dr. Thornton C. Fry has asked to be relieved from the Editorial Committee,
as of January 1, 1947. The Editor wishes to take this opportunity to
express his gratitude for the service which Dr. Fry has rendered in connection with
the editorial work on the Annals during the past nine years.

On behalf of the Editorial Committee for the Annals, the Editor wishes to
acknowledge with thanks the refereeing assistance which has been provided by
the following persons during 1946: R. L. Anderson, T. W. Anderson, David
Blackwell, Z. W. Birnbaum, K. L. Chung, W. J. Dixon, J. L. Doob, M. A.
Girshick, T. E. Harris, L. Henkin, M. Kac, Irving Kaplansky, Bradford F. Kimball,
T. Koopmans, H. Levene, H. B. Mann, P. J. McCarthy, F. C. Mosteller, H.
E. Robbins, D. F. Votaw, J. E. Walsh and C. P. Winsor. The Editor is also
indebted to the following individuals at Princeton University for preparation of
manuscripts for the printer, and other editorial assistance: Mrs. Gladys B. Huling,
Mrs. Eleanor C. Schoonly and J. E. Walsh.

S. S. Wilks 
Editor. 

December 31, 1947 
CONSTITUTION

OF THE

INSTITUTE OF MATHEMATICAL STATISTICS

ARTICLE I
Name and Purpose

1. This organization shall be known as the Institute of Mathematical Statistics.

2. Its object shall be to promote the interests of mathematical statistics.

ARTICLE II

Membership

1. The membership of the Institute shall consist of Members, Fellows, Honorary
Members, and Sustaining Members.

2. Voting members of the Institute shall be (a) the Fellows, and (b) all others, Junior
Members excepted, who have been members for twenty-three months prior to the date
of voting.

3. No person shall be a Junior Member of the Institute for more than a limited term as
determined by the Committee on Membership and approved by the Board of Directors.

ARTICLE III

Officers, Board of Directors, and Committee on Membership

1. The Officers of the Institute shall be a President, two Vice-Presidents, and a
Secretary-Treasurer. The terms of office of the President and Vice-Presidents shall be one
year and that of the Secretary-Treasurer three years. Elections shall be by majority
ballots at Annual Meetings of the Institute. Voting may be in person or by mail.

(a) Exception. The first group of Officers shall be elected by a majority vote of the
individuals present at the organization meeting, and shall serve until December 31, 1936.

2. The Board of Directors of the Institute shall consist of the Officers, the two previous
Presidents, and the Editor of the Official Journal of the Institute.

3. The Institute shall have a Committee on Membership composed of a Chairman and
three Fellows. At their first meeting subsequent to the adoption of this Constitution, the
Board of Directors shall elect three members as Fellows to serve as the Committee on
Membership, one member of the Committee for a term of one year, another for a term of
two years, and another for a term of three years. Thereafter the Board of Directors shall
elect from among the Fellows one member annually at their first meeting after their
election for a term of three years. The President shall designate one of the Vice-Presidents as
Chairman of this Committee.


ARTICLE IV 
Meetings 

1. A meeting for the presentation and discussion of papers, for the election of Officers, 
and for the transaction of other business of the Institute shall be held annually at such 
time as the Board of Directors may designate. Additional meetings may be called from
time to time by the Board of Directors and shall be called at any time by the President
upon written request from ten Fellows. Notice of the time and place of meeting shall be
given to the membership by the Secretary-Treasurer at least thirty days prior to the
date set for the meeting. All meetings except executive sessions shall be open to the
public. Only papers accepted by a Program Committee appointed by the President may
be presented to the Institute.

2. The Board of Directors shall hold a meeting immediately after their election and 
again immediately before the expiration of their term. Other meetings of the Board 
may be held from time to time at the call of the President or any two members of the 
Board. Notice of each meeting of the Board, other than the two regular meetings, 
together with a statement of the business to be brought before the meeting, must be 
given to the members of the Board by the Secretary-Treasurer at least five days prior to 
the date set therefor. Should other business be passed upon, any member of the Board 
shall have the right to reopen the question at the next meeting. 

3. Meetings of the Committee on Membership may be held from time to time at the call
of the Chairman or any member of the Committee provided notice of such call and the
purpose of the meeting is given to the members of the Committee by the
Secretary-Treasurer at least five days before the date set therefor. Should other business be passed
upon, any member of the Committee shall have the right to reopen the question at the
next meeting. Committee business may also be transacted by correspondence if that
seems preferable.

4. At a regularly convened meeting of the Board of Directors, four members shall 
constitute a quorum. At a regularly convened meeting of the Committee on Member- 
ship, two members shall constitute a quorum. 

ARTICLE V

Publications

1. The Annals of Mathematical Statistics shall be the Official Journal for the Institute.
The Editor of the Annals of Mathematical Statistics shall be a Fellow appointed by the
Board of Directors of the Institute. The term of office of the Editor may be terminated
at the discretion of the Board of Directors.

2. Other publications may be originated by the Board of Directors as occasion arises.

ARTICLE VI
Expulsion or Suspension

1. Except for non-payment of dues, no one shall be expelled or suspended except by
action of the Board of Directors with not more than one negative vote.

ARTICLE VII
Amendments

1. This constitution may be amended by an affirmative two-thirds vote at any regularly
convened meeting of the Institute provided notice of such proposed amendment
shall have been sent to each voting member by the Secretary-Treasurer at least thirty
days before the date of the meeting at which the proposal is to be acted upon. Voting
may be in person or by mail.
BY-LAWS

ARTICLE I

Duties of the Officers, the Editor, Board of Directors, and
Committee on Membership

1. The President, or in his absence, one of the Vice-Presidents, or in the absence of the
President and both Vice-Presidents, a Fellow selected by vote of the Fellows present,
shall preside at the meetings of the Institute and of the Board of Directors. At meetings
of the Institute, the presiding officer shall vote only in the case of a tie, but at meetings
of the Board of Directors he may vote in all cases. At least three months before the date
of the annual meeting, the President shall appoint a Nominating Committee of three
members. It shall be the duty of the Nominating Committee to make nominations for
Officers to be elected at the annual meeting and the Secretary-Treasurer shall notify all
voting members at least thirty days before the annual meeting. Additional nominations
may be submitted in writing, if signed by at least ten Fellows of the Institute, up to
the time of the meeting.

2. The Secretary-Treasurer shall keep a full and accurate record of the proceedings
at the meetings of the Institute and of the Board of Directors, send out calls for said
meetings and, with the approval of the President and the Board, carry on the
correspondence of the Institute. Subject to the direction of the Board, he shall have charge
of the archives and other tangible and intangible property of the Institute and once a year
he shall publish in the Annals of Mathematical Statistics a classified list of all Members and
Fellows of the Institute. He shall send out calls for annual dues and acknowledge receipt
of same; pay all bills approved by the President for expenditures authorized by the Board
or the Institute; keep a detailed account of all receipts and expenditures, prepare a
financial statement at the end of each year and present an abstract of the same at the annual
meeting of the Institute after it has been audited by a Member or Fellow of the Institute
appointed by the President as Auditor. The Auditor shall report to the President.

3. Subject to the direction of the Board, the Editor shall be charged with the
responsibility for all editorial matters concerning the editing of the Annals of Mathematical
Statistics. He shall, with the advice and consent of the Board, appoint an Editorial
Committee of not less than twelve members to co-operate with him; four for a period of five years,
four for a period of three years, and the remaining members for a period of two years,
appointments to be made annually as needed. All appointments to the Editorial
Committee shall terminate with the appointment of a new Editor. The Editor shall serve as
editorial adviser in the publication of all scientific monographs and pamphlets authorized
by the Board.

4. The Board of Directors shall have charge of the funds and of the affairs of the
Institute, with the exception of those affairs specifically assigned to the President or to
the Committee on Membership. The Board shall have authority to fill all vacancies
ad interim, occurring among the Officers, Board of Directors, or in any of the Committees.
The Board may appoint such other committees as may be required from time to time
to carry on the affairs of the Institute. The power of election to the different grades of
Membership, except the grades of Member and Junior Member, shall reside in the Board.

5. The Committee on Membership shall prepare and make available through the
Secretary-Treasurer an announcement indicating the qualifications requisite for the
different grades of membership. The Committee shall review these qualifications
periodically and shall make such changes in these qualifications and make such recommendations
with reference to the number of grades of membership as it deems advisable. The power
to elect worthy applicants to the grades of Member and Junior Member shall reside in the
Committee, which may delegate this power to the Secretary-Treasurer, subject to such
reservations as the Committee considers appropriate. The Committee shall make
recommendations to the Board of Directors with reference to placing members in other grades
of membership. The Committee shall give its attention to the question of increasing the
number of applicants for membership and shall advise the Secretary-Treasurer on plans
for that purpose.

ARTICLE II 
Dues 

1. Members shall pay five dollars at the time of admission to membership and shall
receive the full current volume of the Official Journal. Thereafter, Members shall pay
five dollars annual dues. The annual dues of Junior Members shall be two dollars and
fifty cents.

The annual dues of Fellows shall be five dollars. The annual dues of Sustaining
Members shall be fifty dollars. Honorary Members shall be exempt from all dues.

(a) Exception. In the case that two Members of the Institute are husband and wife
and they elect to receive between them only one copy of the Official Journal, the annual
dues of each shall be three dollars and seventy-five cents.

(b) Exception. Any Member or Fellow may make a single payment which will be 
accepted by the Institute in place of all succeeding yearly dues and which will not other- 
wise alter his status as a Member or Fellow. The amount of this payment will depend 
upon the age of this Member or Fellow and will be based upon a suitable table and rate of 
interest, to be specified by the Board of Directors. 

(c) Exception. Any Member or Junior Member of the Institute serving, except as a
commissioned officer, in the Armed Forces of the United States or of one of its allies, may
upon notification to the Secretary-Treasurer be excused from the payment of dues until the
January first following his discharge from the Service. He shall have all privileges of
membership except that he shall not receive the Official Journal. However, during the
first year of his resumed regular membership he may have the right to purchase, at $2.50
per volume, one copy of each volume of the Official Journal published during the period
of his service membership.

2. Annual dues shall be payable on the first day of January of each year.

3. The annual dues of a Fellow, Member, or Junior Member include a subscription to
the Official Journal. The annual dues of a Sustaining Member include two subscriptions
to the Official Journal.

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone whose dues 
may be six months in arrears, and to accompany such notice by a copy of this Article.
If such person fail to pay such dues within three months from the date of mailing such 
notice, the Secretary-Treasurer shall report the delinquent one to the Board of Directors, 
by whom the person’s name may be stricken from the rolls and all privileges of member- 
ship withdrawn. Such person may, however, be re-instated by the Board of Directors 
upon payment of the arrears of dues. 
ARTICLE III

Salaries

1. The Institute shall not pay a salary to any Officer, Director, or member of any
committee.


ARTICLE IV
Amendments

1. These By-Laws may be amended in the same manner as the Constitution or by a
majority vote at any regularly convened meeting of the Institute, if the proposed
amendment has been previously approved by the Board of Directors.
PROBLEMS IN PROBABILITY THEORY
By Harald Cramér
University of Stockholm

1. Introduction. The following survey of problems in probability theory
has been written for the occasion of the Princeton Bicentennial Conference on
“The Problems of Mathematics,” Dec. 17-19, 1946. It is strictly confined to
the purely mathematical aspects of the subject. Thus all questions concerned
with the philosophical foundations of mathematical probability, or with its
ever increasing fields of application, will be entirely left out.

No attempt at completeness has been made, and the choice of the problems
considered is, of course, highly subjective. It is also necessary to point out
explicitly that the literature of the war years has only recently, and still far
from completely, been available in Sweden. Owing to this fact, it is almost
unavoidable that this paper will be found incomplete in many respects.

I. FUNDAMENTAL NOTIONS

2. Probability distributions. From a purely mathematical point of view,
probability theory may be regarded as the theory of certain classes of additive
set functions, defined on spaces of more or less general types. The basic
structure of the theory has been set out in a clear and concise way in the well-known
treatise by Kolmogoroff [63]. We shall begin by recalling some of the main
definitions. Note that the word additive, when used in connection with sets
or set functions, will always refer to a finite or enumerable sequence of sets.

Let $\omega$ denote a variable point in an entirely arbitrary space $\Omega$, and consider
an additive class $C$ of sets in $\Omega$, such that the whole space $\Omega$ itself is a member of
$C$. Further, let $P(S)$ be an additive set function, defined for all sets $S$ belonging
to the class $C$, and suppose that

$$P(S) \ge 0 \quad \text{for all } S \text{ in } C,$$

$$P(\Omega) = 1.$$

We shall then say that $P(S)$ is a probability measure, which defines a probability
distribution in $\Omega$. For any set $S$ in $C$, the quantity $P(S)$ is called the probability
of the event expressed by the relation $\omega \subset S$, i.e. the event that the variable
point $\omega$ takes a value belonging to $S$. Accordingly we write

$$P(S) = P(\omega \subset S).$$

Suppose now that $\omega' = g(\omega)$ is a function of the variable point $\omega$, defined
throughout the space $\Omega$, the values $\omega'$ being points of another arbitrary space $\Omega'$.
Let $S'$ be a set in $\Omega'$ and denote by $S$ the set of all points $\omega$ such that
$\omega' = g(\omega)$ belongs to $S'$. Whenever $S$ belongs to $C$, we define a set function $P'(S')$
by writing

$$P'(S') = P(S).$$

It is then easy to see that $P'(S')$ is defined for all $S'$ belonging to a certain
additive class $C'$ in the new space $\Omega'$, and that $P'(S')$ is a probability measure
in $\Omega'$, such that $P'(S')$ signifies the probability of the event $\omega' \subset S'$ (which is
equivalent to $\omega \subset S$). We shall say that $P'(S')$ is attached to the probability
distribution in $\Omega'$ which is induced by the given distribution in $\Omega$ and the function
$\omega' = g(\omega)$.
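The construction of the induced distribution can be illustrated on a finite space, where every set is measurable and $P$ is given by point masses. The following is only a minimal sketch under that assumption; the function name `induced_distribution` and the die example are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def induced_distribution(p, g):
    """Return the point masses of P' on the image space.

    p maps each point w of a finite space to its probability P({w});
    g maps points w to points w' = g(w) of another space.
    P'({w'}) is the total mass of all w with g(w) = w'.
    """
    p_prime = defaultdict(Fraction)
    for w, mass in p.items():
        p_prime[g(w)] += mass
    return dict(p_prime)


# Hypothetical example: a fair die, with g(w) the parity of the outcome.
die = {w: Fraction(1, 6) for w in range(1, 7)}
parity = induced_distribution(die, lambda w: w % 2)
```

Here `parity` plays the role of $P'$: the mass it assigns to a value $\omega'$ is $P(S)$ for the set $S$ of outcomes mapped to $\omega'$.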

3. Random variables. Consider in particular the case when $\omega'$ is a real
number $\xi$, such that $\xi = g(\omega)$ is a real-valued $C$-measurable function of the
argument $\omega$. Then $C'$ includes the class $B_1$ of all Borel sets $S'$ of the space
$\Omega' = R_1$ of all real numbers, and we shall call $\xi$ a one-dimensional real random variable.
The probability of the event $\xi \subset S'$ is uniquely defined for any Borel set $S'$ of
$R_1$, as soon as the function

$$F(x) = P(\xi \le x)$$

is known for all real $x$. $F(x)$ is called the distribution function (d.f.) of the
random variable $\xi$. If the function $\xi = g(\omega)$ is integrable over $\Omega$ with respect
to the measure $P(S)$, we write

$$E\xi = \int_{\Omega} g(\omega)\, dP = \int_{-\infty}^{\infty} x\, dF(x),$$

and denote this expression as the expectation or mean value of the random
variable $\xi$. Any real-valued $B_1$-measurable function $\eta = h(\xi)$ is also a random
variable with the probability distribution induced by the original $\omega$-distribution
and the function $\eta = h(g(\omega))$. If $\eta$ is integrable over $\Omega$ with respect to $P$, its
mean value may be written in the form

$$E\eta = Eh(\xi) = \int_{\Omega} h(g(\omega))\, dP = \int_{-\infty}^{\infty} h(x)\, dF(x).$$
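The two forms of the mean value of $h(\xi)$, as an integral over $\Omega$ and as an integral with respect to the distribution of $\xi$, can be compared on a small finite example. This is a sketch under assumed data (two fair coin tosses, $\xi$ the number of heads, $h(x) = x^2$), none of which appears in the original text.

```python
from fractions import Fraction

# Hypothetical finite space: two fair coin tosses, each point has mass 1/4.
omega = [(a, b) for a in (0, 1) for b in (0, 1)]
P = {w: Fraction(1, 4) for w in omega}


def g(w):
    """Random variable xi = number of heads."""
    return w[0] + w[1]


def h(x):
    """Function of the random variable, eta = h(xi)."""
    return x * x


# Integral over Omega: sum of h(g(w)) * P({w}).
mean_over_omega = sum(h(g(w)) * P[w] for w in omega)

# Integral with respect to the induced distribution of xi:
# sum of h(x) * P(xi = x) over the possible values x.
dist = {}
for w in omega:
    dist[g(w)] = dist.get(g(w), Fraction(0)) + P[w]
mean_over_distribution = sum(h(x) * mass for x, mass in dist.items())
```

Both sums give the same value, which is the discrete analogue of the identity between the two integrals.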

More generally, if $\omega' = (\xi_1, \dots, \xi_n)$ is a point in an $n$-dimensional Euclidean
space $R_n$, while $C'$ includes the class $B_n$ of all Borel sets of $R_n$, we are
concerned with an $n$-dimensional real random variable. The distribution of this
variable, which is also called the joint distribution of the $n$ one-dimensional
variables $\xi_1, \dots, \xi_n$, is uniquely defined, as soon as the joint d.f.

$$F(x_1, \dots, x_n) = P(\xi_1 \le x_1, \dots, \xi_n \le x_n)$$

is known for all real $x_1, \dots, x_n$.

The variables $\xi_1, \dots, \xi_n$ are said to be independent, if
$F(x_1, \dots, x_n) = F_1(x_1) \cdots F_n(x_n)$, where $F_j(x_j)$ is the d.f. of the variable $\xi_j$.

The extension to complex random variables is obvious. Suppose e.g. that
$\xi = g(\omega)$ and $\eta = h(\omega)$ are two one-dimensional real variables, and consider
the complex variable $\xi + i\eta = g(\omega) + ih(\omega)$. By definition, we identify the
distribution of this variable with that of the two-dimensional real variable
$(\xi, \eta)$, and we put

$$E(\xi + i\eta) = E\xi + iE\eta.$$

Joint distributions of several complex variables are introduced in a corresponding
way.

4. Characteristic functions. If $\xi$ is a one-dimensional real random variable,
the mean value

$$\varphi(z) = E e^{iz\xi} = \int_{-\infty}^{\infty} e^{izx}\, dF(x)$$

exists for all real $z$, and we have

$$|\varphi(z)| \le 1, \qquad \varphi(0) = 1.$$

$\varphi(z)$ is called the characteristic function (c.f.) of the distribution corresponding
to the variable $\xi$. The reciprocal formula (Lévy)

$$F(x) - F(y) = \frac{1}{2\pi} \lim_{Z \to \infty} \int_{-Z}^{Z} \frac{e^{-izx} - e^{-izy}}{-iz}\, \varphi(z)\, dz,$$

which holds for any continuity points $x$ and $y$ of $F$, shows that there is a
one-one correspondence between the d.f. $F(x)$ and the c.f. $\varphi(z)$. As we shall see
below, the c.f. provides a powerful analytical tool for operations with
probability distributions.
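The reciprocal formula can be checked numerically for a distribution whose c.f. is known in closed form. The sketch below assumes the standard normal distribution, with c.f. $e^{-z^2/2}$, and replaces the limit $Z \to \infty$ by a fixed cut-off and the integral by a midpoint rule; the cut-off `Z`, the number of steps `n` and the function names are illustrative choices, not part of the paper.

```python
import cmath
import math


def cf_normal(z):
    """Characteristic function of the standard normal distribution."""
    return math.exp(-0.5 * z * z)


def inversion(cf, x, y, Z=10.0, n=4000):
    """Approximate F(x) - F(y) from the c.f. by the reciprocal formula.

    The integral over (-Z, Z) is evaluated by the midpoint rule; with an
    even n the midpoints never hit z = 0, where the kernel has a
    removable singularity.
    """
    step = 2 * Z / n
    total = 0j
    for k in range(n):
        z = -Z + (k + 0.5) * step
        kernel = (cmath.exp(-1j * z * x) - cmath.exp(-1j * z * y)) / (-1j * z)
        total += kernel * cf(z)
    return (total * step / (2 * math.pi)).real


# F(1) - F(-1) for the standard normal d.f., recovered from the c.f. alone.
approx = inversion(cf_normal, 1.0, -1.0)
exact = math.erf(1 / math.sqrt(2))
```

The value `approx` can be compared with `exact`, which is the same difference computed directly from the normal d.f.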

When a complex-valued function $\varphi(z)$ of the real variable $z$ is given, it is
often important to be able to decide whether $\varphi(z)$ is or is not the c.f. of some
distribution. If we assume a priori that $\varphi(0) = 1$, each of the following
conditions is necessary and sufficient for $\varphi(z)$ to be a c.f.

A. $\varphi(z)$ should be bounded and continuous for all $z$, and such that the integral

$$\int_0^A \int_0^A \varphi(z - u)\, e^{ix(z - u)}\, dz\, du$$

is real and non-negative for all real $x$ and all $A > 0$ (Cramér [11], in
simplification of an earlier result due to Bochner, [4]).

B. There should exist a sequence of functions $\psi_1(z), \psi_2(z), \dots$ such that

$$\varphi(z) = \lim_{n \to \infty} \int_{-\infty}^{\infty} \psi_n(x + z)\, \overline{\psi_n(x)}\, dx$$

holds uniformly in every finite $z$-interval (Khintchine, [45]).

These general theorems are not always easy to apply in practice. Among
less general results which are more easily applicable, we mention the almost
trivial fact that a function $\varphi(z)$ which near $z = 0$ is of the form $\varphi(z) = 1 + o(z^2)$
cannot be a c.f. unless $\varphi(z) = 1$ for all $z$, and the two following theorems:

1) An integral function $\varphi(z)$ of order $\gamma < 1$ can never be a c.f. (Lévy, [64]), and

2) an integral function $\varphi(z)$ of finite order $\gamma > 2$ cannot be a c.f. unless the
convergence exponent of its zeros is equal to $\gamma$ (Marcinkiewicz, [72]). The
latter result shows e.g. that no function of the form $e^{g(z)}$, where $g(z)$ is a
polynomial of degree $> 2$, can be a c.f.

It would be highly desirable to obtain further results in this direction.
The c.f. of the joint distribution of $n$ real random variables $\xi_1, \dots, \xi_n$ is the
function $\varphi(z_1, \dots, z_n)$ defined by the relation

$$\varphi(z_1, \dots, z_n) = E e^{i(z_1 \xi_1 + \cdots + z_n \xi_n)}.$$

Most of the above results for c.f.'s in one variable can be directly generalized
to the multi-variable case.

5. Random sequences and random functions. Let $t$ be a variable point in
an arbitrary space $T$, and consider the space $\Omega$, where each point $\omega$ is a
real-valued function $\omega = x(t)$ of the variable argument $t$. Let $t_1, \dots, t_n$ be any
finite set of distinct points $t$. The set of all functions $\omega = x(t)$ satisfying the
inequalities

$$a_j < x(t_j) \le b_j, \qquad (j = 1, \dots, n),$$

will be called an interval in the space $\Omega$. The Borel sets in $\Omega$ will be defined as
the smallest additive class $B$ of sets in $\Omega$ containing all intervals.

Suppose now that, for any choice of $n$ and the $t_j$, the variables $x(t_1), \dots, x(t_n)$
are random variables having a known $n$-dimensional joint distribution. If the
family of all distributions corresponding in this way to finite sequences
$t_1, \dots, t_n$ satisfies certain obvious consistency conditions, a fundamental theorem
due to Kolmogoroff asserts that this family determines a unique probability
distribution in the space $\Omega$ of all functions $x(t)$. The corresponding probability

$$P(S) = P(x(t) \subset S)$$

is uniquely defined for all Borel sets $S$ of $\Omega$.

Consider in particular the case where $T$ is the set of non-negative integers
$t = 0, 1, 2, \dots$. The space $\Omega$ then is the space of all sequences $(x_0, x_1, \dots)$
of real numbers. As soon as the joint distribution of any finite number of
variables $x_{t_1}, \dots, x_{t_n}$ is defined, and these distributions are mutually
consistent, it then follows that there is a unique probability distribution of the
random sequence $(x_0, x_1, \dots)$, the corresponding probability being defined
for every Borel set of the space $\Omega$ of sequences. Similarly we may consider the
doubly infinite sequence $(\dots, x_{-1}, x_0, x_1, \dots)$.

Consider further the more general case when $T$ is any set of real numbers.
Then $\Omega$ is the space of all real-valued functions $\omega = x(t)$ defined on the set $T$,
and as before the knowledge of the distributions for all finite sets of variables
$x(t_1), \dots, x(t_n)$ permits us to determine a probability distribution in the space
$\Omega$ of random functions $x(t)$, the probability $P(S) = P(x(t) \subset S)$ being always
defined for all Borel sets $S$ in $\Omega$.

The generalization of the above considerations to complex-valued random
sequences and functions is immediate.

6. Various modes of convergence. Consider a sequence $F_1(x), F_2(x), \dots$
of d.f.'s, and let the corresponding c.f.'s be $\varphi_1(t), \varphi_2(t), \dots$. In order that $F_n(x)$
converge to a d.f. $F(x)$, in every continuity point of the latter, it is necessary
and sufficient¹ that $\varphi_n(t)$ converge for every real $t$ to a limit $\varphi(t)$ which is
continuous at $t = 0$. Then $\varphi(t)$ is the c.f. corresponding to the d.f. $F(x)$.

Further, let $x$ and $x_1, x_2, \dots$ be complex-valued random variables, such
that the random sequence $(x, x_1, x_2, \dots)$ has a well defined distribution. We
shall be concerned with various modes of convergence of $x_n$ to $x$.

A. When $P(|x_n - x| > \varepsilon) \to 0$ as $n \to \infty$, for any $\varepsilon > 0$, we shall say that
$x_n$ converges to $x$ in probability.

B. When $E|x_n - x|^{\gamma} \to 0$ as $n \to \infty$, where $\gamma > 0$ is fixed, we shall say that
$x_n$ converges to $x$ in the mean of order $\gamma$. Unless otherwise stated we shall in
the sequel always consider the case $\gamma = 2$, and in this case we shall use the
notation

$$\underset{n \to \infty}{\mathrm{l.i.m.}}\; x_n = x.$$

C. When $P(\lim_{n \to \infty} x_n = x) = 1$, we shall say that $x_n$ converges with probability
one, or converges almost certainly to $x$.

With respect to the last definition, we may remark that the set defined by
the relation $\lim x_n = x$ is always a Borel set in the space of our random sequence,
so that the probability of this relation is well defined. In fact, this probability
is given by the expression

$$\lim_{m \to \infty} \lim_{n \to \infty} \lim_{p \to \infty} P\left( |x_\nu - x| < \frac{1}{m} \ \text{for } \nu = n, n + 1, \dots, n + p \right),$$

where the limit process applies to a probability attached to a Borel set in a finite
number of dimensions. The case of almost certain convergence is precisely
the case when this expression takes the value 1.

Convergence in the mean of any positive order, as well as almost certain
convergence, both imply convergence in probability, which may be written
symbolically $B \to A$ and $C \to A$. Between B and C, there is no simple relation
of this kind. Further, A and B both imply almost certain convergence for any
partial sequence $x_{n_1}, x_{n_2}, \dots$ such that the subscripts $n_k$ increase sufficiently
rapidly with $k$.
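The implication $B \to A$ cannot be reversed, and a standard textbook example (not taken from the paper) makes this concrete: let $x_n$ take the value $n$ with probability $1/n$ and the value 0 otherwise, with limit $x = 0$. The sketch below computes the two quantities appearing in definitions A and B exactly; the function names are hypothetical.

```python
from fractions import Fraction


def tail_probability(n, eps):
    """P(|x_n - 0| > eps) when x_n = n with probability 1/n, else 0."""
    return Fraction(1, n) if n > eps else Fraction(0)


def second_moment(n):
    """E|x_n - 0|^2 = n^2 * (1/n) = n for the same variable."""
    return Fraction(n * n, n)


# The tail probabilities tend to 0 (mode A holds), while the second
# moments grow without bound (mode B with gamma = 2 fails).
tails = [tail_probability(n, Fraction(1, 2)) for n in (10, 100, 1000)]
moments = [second_moment(n) for n in (10, 100, 1000)]
```

Under these assumptions the sequence converges to 0 in probability but not in the mean of order 2.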

II. PROBLEMS CONNECTED WITH THE ADDITION OF 
INDEPENDENT VARIABLES 

7. During the early development of the theory of probability, the majority
of problems considered were connected with gambling. The gain of a player
in a certain game may be regarded as a random variable, and his total gain in a
sequence of repetitions of the game is the sum of a number of independent
variables, each of which represents the gain in a single performance of the game.
Accordingly a great amount of work was devoted to the study of the probability
distributions of such sums. A little later, problems of a similar type appeared
in connection with the theory of errors of observation, when the total error was
considered as the sum of a certain number of partial errors due to mutually
independent causes. At first only particular cases were considered, but gradually
general types of problems began to arise, and in the classical work of Laplace
several results are given concerning the general problem of studying the distribution
of a sum

$$z_n = x_1 + x_2 + \cdots + x_n$$

of independent variables, when the distributions of the $x_\nu$ are given. This
problem may be regarded as the very starting point of a large number of those
investigations by which the modern Theory of Probability was created. The
efforts to prove certain statements of Laplace, and to extend his results further
in various directions, have largely contributed to the introduction of rigorous
foundations of the subject, and to the development of the analytical methods.
At the same time, more general types of problems have developed from the
original problem, and the number and importance of practical applications
have been steadily increasing.

¹ As I have already stated in a paper published in 1938, there is an error in the
statement of this theorem, given in my Cambridge Tract [9] Random Variables and Probability
Distributions. For the truth of the theorem, it is essential that $\varphi_n(t)$ should be supposed
to converge to $\varphi(t)$ for every real $t$. However, in the particular case when the limit $\varphi(t)$
is analytic and regular in the vicinity of $t = 0$, it can be proved that it is sufficient to assume
convergence in some interval $|t| < a$.

8. Composition of distributions. Let $x_1$ and $x_2$ be two independent variables,
with the d.f.'s $F_1$ and $F_2$, and the c.f.'s $\varphi_1$ and $\varphi_2$, and let the sum $x_1 + x_2$ have
the d.f. $F$ and the c.f. $\varphi$. Then

$$F(x) = \int_{-\infty}^{\infty} F_1(x - y)\, dF_2(y) = \int_{-\infty}^{\infty} F_2(x - y)\, dF_1(y).$$

We shall say that $F$ is the composition of $F_1$ and $F_2$, and write this as a symbolical
multiplication:

$$F = F_1 * F_2 = F_2 * F_1.$$

To this symbolical multiplication of the d.f.'s corresponds a real multiplication
of the c.f.'s:

$$\varphi(z) = \varphi_1(z)\, \varphi_2(z).$$

The operation of composition is both commutative and associative, so that
any symbolical product $F = F_1 * F_2 * \cdots * F_n$ is uniquely defined and independent
of the order of the components. When at least one of the components is
continuous (absolutely continuous), the same holds for the composite, and in
many cases it is true that the composite is at least as regular as the most regular
of the components (Lévy, [58], [63], etc.). However, this statement
does not hold generally, as is shown by an interesting example due to Raikov,
[77], where $F_1$ and $F_2$ are integral analytic functions, while the composite
$F = F_1 * F_2$ is not regular at the origin.
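For distributions concentrated on finitely many points, the composition reduces to a discrete convolution of the point masses, and the product rule for c.f.'s can be checked directly. The following sketch assumes two small hypothetical distributions `p1` and `p2`; the helper names `compose` and `cf` are illustrative only.

```python
import cmath


def compose(p1, p2):
    """Point masses of the sum of two independent discrete variables."""
    out = {}
    for x1, a in p1.items():
        for x2, b in p2.items():
            out[x1 + x2] = out.get(x1 + x2, 0.0) + a * b
    return out


def cf(p, z):
    """Characteristic function of a discrete distribution at the real point z."""
    return sum(mass * cmath.exp(1j * z * x) for x, mass in p.items())


# Hypothetical components: one and two tosses of a fair coin.
p1 = {0: 0.5, 1: 0.5}
p2 = {0: 0.25, 1: 0.5, 2: 0.25}
p = compose(p1, p2)

# The c.f. of the composite equals the product of the component c.f.'s.
z = 0.7
difference = abs(cf(p, z) - cf(p1, z) * cf(p2, z))
```

Commutativity of the composition corresponds to `compose(p1, p2)` and `compose(p2, p1)` giving the same point masses.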

It seems to be an important unsolved problem to find convenient restrictions
ensuring the validity of the above statements of the “smoothing effect” of
the operation of composition.

When $F = F_1 * F_2$, we may say that $F$ is “divisible” by each component $F_1$
and $F_2$, and it seems natural to try to develop a theory of symbolical
factorization for d.f.'s. In this connection, it is important to note that symbolical
division is not unique. In fact, Khintchine has shown by an example that it is
possible to find d.f.'s $F$, $F_1$, $F_2$, and $F_3$ such that

$$F = F_1 * F_2 = F_1 * F_3,$$

while $F_2 \ne F_3$. Another fundamental problem belonging to this order of ideas
is to decide whether a given d.f. $F$ is decomposable or not. $F$ is called
decomposable, if there is at least one representation of the form $F = F_1 * F_2$, where
each component $F_j$ has more than one point of increase. So far, this problem
has only been solved in very special cases, and the general problem still
remains open for research. A particular case of some interest would be to know
if there exists an absolutely continuous and indecomposable d.f., such that
$F(a) = 0$ and $F(b) = 1$ for some finite $a$ and $b$.

As soon as we restrict ourselves to certain special classes of distributions,
it is possible to reach results of a more definite character concerning the
factorization problems. Some results of this type will be considered below.


9. Closed families of distributions. The fact that certain families of
distributions are closed with respect to the operation of composition has played
an important part in many applications. If $F_1$ and $F_2$ belong to a family of
this character, so does the symbolical product $F = F_1 * F_2$. We first give some
simple examples of such families.

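The normal distribution. The d.f. $F$ has the form
$F = \Phi\left(\frac{x - m}{\sigma}\right)$, where $\sigma > 0$, and

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt.$$

The c.f. corresponding to $F$ is $e^{miz - \frac{1}{2}\sigma^2 z^2}$, and it follows
that for any real $m_1$, $m_2$ and any positive $\sigma_1$, $\sigma_2$ we have

$$\Phi\left(\frac{x - m_1}{\sigma_1}\right) * \Phi\left(\frac{x - m_2}{\sigma_2}\right) = \Phi\left(\frac{x - m}{\sigma}\right),$$

where

$$m = m_1 + m_2, \qquad \sigma^2 = \sigma_1^2 + \sigma_2^2.$$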

The Poisson distribution. Here the d.f. is $F = F(x, \lambda, m, a)$, where
$\lambda > 0$, $a \neq 0$, and $F$ is a step-function with a jump equal to
$\frac{\lambda^{\nu}}{\nu!} e^{-\lambda}$ in the point $x = m + \nu a$, where
$\nu = 0, 1, \ldots$. The corresponding c.f. is $e^{miz + \lambda(e^{aiz} - 1)}$,
and it follows that for any fixed $a$ we have

$$F(x, \lambda_1, m_1, a) * F(x, \lambda_2, m_2, a) = F(x, \lambda_1 + \lambda_2, m_1 + m_2, a).$$

The Pearson Type III distribution.

$$F = F(x, \alpha, \lambda) = \frac{\alpha^{\lambda}}{\Gamma(\lambda)} \int_0^x t^{\lambda - 1} e^{-\alpha t}\,dt \qquad (x > 0).$$

The corresponding c.f. is $\left(1 - \frac{iz}{\alpha}\right)^{-\lambda}$, and for any
fixed $\alpha > 0$ and any positive $\lambda_1$ and $\lambda_2$ we have

$$F(x, \alpha, \lambda_1) * F(x, \alpha, \lambda_2) = F(x, \alpha, \lambda_1 + \lambda_2).$$
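The closure of the Poisson family under composition can be verified numerically. The sketch below treats only the special case $m = 0$, $a = 1$ (an assumption made to keep the example short), so that the jumps sit at the non-negative integers; the parameter values and helper names are likewise chosen only for illustration.

```python
import math

def poisson_masses(lam, n_max):
    """Jumps lam**k * exp(-lam) / k! at the points k = 0, ..., n_max."""
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(n_max + 1)]

def compose(p, q):
    """Composition of two mass sequences on 0, 1, 2, ..., kept up to index len(p) - 1.

    For k < len(p) the sum below uses every term of the full convolution,
    so the first len(p) masses of the composite are not affected by truncation."""
    return [sum(p[j] * q[k - j] for j in range(k + 1)) for k in range(len(p))]

lam1, lam2, n_max = 1.5, 2.5, 30
lhs = compose(poisson_masses(lam1, n_max), poisson_masses(lam2, n_max))
rhs = poisson_masses(lam1 + lam2, n_max)

# Largest difference between the composite and the Poisson masses with lam1 + lam2.
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap)
```

The gap is at the level of floating-point rounding, in agreement with the relation $F(x, \lambda_1, m_1, a) * F(x, \lambda_2, m_2, a) = F(x, \lambda_1 + \lambda_2, m_1 + m_2, a)$.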

Stable distributions. We shall say that a closed family is stable, when all
its members are of the form $F(ax + b)$, where $F$ is a d.f., while $a > 0$ and $b$
are constants. Obviously the normal family is an example of a stable family. It
has been shown by Lévy and Khintchine [49] that a d.f. $F(x)$ generates a stable
family when and only when the logarithm of its c.f. is of the form

$$\log \varphi(z) = \beta i z - \gamma |z|^{\alpha} \left(1 + i\delta\,\frac{z}{|z|}\,\omega\right), \tag{9.1}$$

where $\alpha$, $\beta$, $\gamma$, $\delta$ are real constants such that

$$0 < \alpha \leq 2, \qquad \gamma > 0, \qquad |\delta| \leq 1,$$

while

$$\omega = \begin{cases} \tan \dfrac{\alpha\pi}{2} & \text{for } \alpha \neq 1, \\[1ex] \dfrac{2}{\pi} \log |z| & \text{for } \alpha = 1. \end{cases}$$

For $\alpha = 2$ we obtain the normal family.
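The stability property is easy to check in the symmetric special case $\beta = \delta = 0$ of (9.1), where $\log \varphi(z) = -\gamma |z|^{\alpha}$. If $x_1$ and $x_2$ are independent with this c.f., then $c_1 x_1 + c_2 x_2$ has the same c.f. as $c\,x_1$ with $c^{\alpha} = c_1^{\alpha} + c_2^{\alpha}$. The sketch below verifies this identity numerically; the restriction to the symmetric case, the parameter values and the function names are assumptions made for illustration.

```python
def log_cf(z, alpha, gamma):
    """Logarithm of the c.f. in the symmetric case beta = delta = 0 of (9.1)."""
    return -gamma * abs(z) ** alpha

def combined_scale(c1, c2, alpha):
    """Scale c with c**alpha = c1**alpha + c2**alpha."""
    return (c1 ** alpha + c2 ** alpha) ** (1.0 / alpha)

alpha, gamma = 1.5, 0.7
c1, c2 = 2.0, 3.0
c = combined_scale(c1, c2, alpha)

gaps = []
for z in (-2.0, -0.5, 0.1, 1.0, 4.0):
    # c.f. of c1*x1 + c2*x2 is the product of the c.f.'s at c1*z and c2*z.
    lhs = log_cf(c1 * z, alpha, gamma) + log_cf(c2 * z, alpha, gamma)
    rhs = log_cf(c * z, alpha, gamma)
    gaps.append(abs(lhs - rhs))
print(max(gaps))
```

For $\alpha = 2$ the rule $c^2 = c_1^2 + c_2^2$ is the familiar addition of variances in the normal family.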

A more general and very important closed family is the family $I$ of infinitely
divisible distributions. A d.f. $F$ belongs to $I$ if to every $n = 1, 2, \ldots$
there exists a d.f. $G$ such that $F = G^{*n}$, where $G^{*n}$ denotes the symbolical
$n$th power of $G$. Obviously the family $I$ is a closed family which contains all
the families mentioned above. Lévy [60], [63], has shown that $F$ is infinitely
divisible when and only when the logarithm of its c.f. is of the form

$$\log \varphi(z) = \beta i z - \gamma z^2 + \int_{-\infty}^{0} \left(e^{izu} - 1 - \frac{izu}{1 + u^2}\right) dM(u) + \int_{0}^{\infty} \left(e^{izu} - 1 - \frac{izu}{1 + u^2}\right) dN(u), \tag{9.2}$$

where $\beta$ and $\gamma \geq 0$ are real constants, while $M(u)$ and $N(u)$ are
non-decreasing functions such that

$$M(-\infty) = N(+\infty) = 0,$$

$$\int_{-a}^{0} u^2\,dM(u) < \infty \quad \text{and} \quad \int_{0}^{a} u^2\,dN(u) < \infty$$

for any finite $a > 0$. When $M$ and $N$ reduce to zero, we obtain the normal
family. When $\gamma = 0$ and one of the functions $M$ and $N$ reduces to zero, while
the other is a step-function with a single jump equal to $\lambda$ at the point $u = a$,
we obtain a Poisson family. Generally, it follows from (9.2) that any infinitely
divisible distribution may be regarded as a product of a normal distribution
and a finite, enumerable or continuous set of Poisson distributions.
The representation of $\log \varphi(z)$ in the form (9.2) is unique. It follows that
the problem of finding all possible factorizations of an infinitely divisible d.f. $F$
can be completely solved, as long as we restrict ourselves to factors which are
themselves infinitely divisible. In fact, in order that

$$F = F_1 * F_2,$$

where all three d.f.'s belong to $I$, it is necessary and sufficient that the logarithms
of the corresponding c.f.'s should be of the form (9.2), with

$$\beta = \beta_1 + \beta_2, \qquad \gamma = \gamma_1 + \gamma_2,$$

$$M = M_1 + M_2, \qquad N = N_1 + N_2.$$

In the two simple cases of the normal and the Poisson distributions, the
decompositions obtained in this way remain the only possible ones, even if we remove
the restriction that the factors should belong to $I$. Thus in any factorization
of a normal distribution, all factors are normal (Cramér, [8]), while in any
factorization of a Poisson distribution, all factors belong to the Poisson family
(Raikov, [75]). For the type III distribution, and the non-normal stable
distributions, however, the corresponding property does not hold.

In some cases, an infinitely divisible distribution may be represented as a
product of indecomposable distributions, or as a product of an indecomposable
distribution and another infinitely divisible distribution. The results so far
obtained in this direction (Lévy, [63], [64]; Khintchine, [46], [47]; Raikov, [76])
are all concerned with more or less particular cases, and the general factorization
problem for infinitely divisible distributions still remains unsolved. A
particular case of some interest would be the case when the functions $M$ and $N$
are both absolutely continuous. There does not seem to have been given any
example of this type, where a factor not belonging to $I$ may occur.¹

Finally we mention a general theorem due to Khintchine, [46], which asserts
that an arbitrary d.f. $F$ may be represented in one of the forms

$$F = G, \qquad F = H \qquad \text{or} \qquad F = G * H,$$

where $G$ is infinitely divisible, while $H$ is a finite or infinite product of
indecomposable factors. This seems to be practically the only result so far known
concerning the factorization of a general distribution.

A certain number of the results mentioned above have been generalized to
multi-dimensional distributions.


¹ While the present paper was being printed, I have proved that such factors do occur,
as soon as at least one of the derivatives $M'$ and $N'$ is bounded away from zero in some
interval $(-a, 0)$ or $(0, a)$.

10. The Laws of large numbers. In modern terminology, the classical
Bernoulli theorem may be expressed in the following way. Let $x_1, x_2, \ldots$ be
a sequence of independent variables, such that each $x_\nu$ may only assume the
values 1 and 0, the corresponding probabilities being $p$ and $q = 1 - p$. Then
the arithmetic mean

$$\frac{z_n}{n} = \frac{x_1 + \cdots + x_n}{n} \tag{10.1}$$

converges in probability to $p$, as $n \to \infty$.
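Convergence in probability in the Bernoulli theorem means that $P(|z_n/n - p| > \varepsilon)$ tends to zero for every fixed $\varepsilon > 0$. Since $z_n$ has a binomial distribution, this probability can be computed exactly rather than simulated. The sketch below does so for a few values of $n$; the values of $p$, $\varepsilon$ and $n$, and the function name, are assumptions chosen for illustration.

```python
import math
from fractions import Fraction

def prob_deviation(n, p, eps):
    """P(|z_n/n - p| > eps) when z_n is a sum of n independent 0-1 variables.

    p and eps are Fractions so that the comparison with eps is exact;
    the binomial masses themselves are evaluated in floating point."""
    log_p, log_q = math.log(float(p)), math.log(float(1 - p))
    total = 0.0
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) > eps:
            log_mass = (math.lgamma(n + 1) - math.lgamma(k + 1)
                        - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
            total += math.exp(log_mass)
    return total

p, eps = Fraction(3, 10), Fraction(1, 20)
probs = [prob_deviation(n, p, eps) for n in (10, 100, 1000)]
print(probs)
```

The printed probabilities decrease as $n$ grows, which is the content of the theorem for this particular $\varepsilon$.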

Both classical and modern authors have devoted much work to the generalization
of this simple result in various directions. Generally, we shall say
that a sequence of random variables $x_1, x_2, \ldots$ satisfies the Weak Law of Large
Numbers if there exist two sequences of constants $a_1, a_2, \ldots$ and
$b_1, b_2, \ldots$, such that $a_n > 0$, and

$$\frac{z_n - b_n}{a_n} = \frac{x_1 + \cdots + x_n - b_n}{a_n}$$

converges in probability to zero.

Let $x_1, x_2, \ldots$ be independent variables, such that $x_n$ has the d.f. $F_n(x)$.
It has been shown by Feller [27] that for any given sequence $a_1, a_2, \ldots$, the
conditions

$$\sum_{\nu=1}^{n} \int_{|x| > a_n} dF_\nu(x) = o(1), \qquad \sum_{\nu=1}^{n} \int_{|x| < a_n} x^2\,dF_\nu(x) = o(a_n^2), \tag{10.2}$$

are sufficient for the validity of the weak law of large numbers, and that the
corresponding sequence $b_1, b_2, \ldots$ can be defined by

$$b_n = \sum_{\nu=1}^{n} \int_{|x| < a_n} x\,dF_\nu(x).$$

When there is a constant $c > 0$ such that for all $\nu$

$$F_\nu(+0) > c, \qquad F_\nu(-0) < 1 - c, \tag{10.3}$$

the conditions are also necessary. This theorem contains as particular cases
all previously known results in this direction. A simple NS condition for the
existence of at least one sequence $a_1, a_2, \ldots$ such that (10.2) holds does not
seem to be known.

When the weak law is satisfied, this means that, for any given $\varepsilon > 0$ and for
any fixed large $n$, there is a probability very near to 1 that the sum
$z_n = x_1 + \cdots + x_n$ will fall between the limits $b_n \pm \varepsilon a_n$. The more
stringent condition that, with a probability tending to 1 as $n \to \infty$, $z_\nu$ will
fall between the limits $b_\nu \pm \varepsilon a_\nu$ for all values of $\nu \geq n$ is
equivalent to the condition that $\frac{z_n - b_n}{a_n}$ converges almost certainly to
zero. When this holds, we shall say that the variables $x_\nu$ satisfy the Strong Law
of Large Numbers. The most important result so far known in this connection is
concerned with the case $a_n = n$, and is expressed by the following theorem
(Kolmogoroff, [52], [55]):

When the $x_\nu$ are independent and (10.3) holds, a sufficient condition for the
validity of the strong law with $a_n = n$ consists in the simultaneous convergence of
the two series



Some improved conditions of this type have been given by Marcinkiewicz
and Zygmund, [73], but the problem of finding a NS condition for the strong
law is still unsolved, even in the case $a_n = n$.

Important generalizations of the laws of large numbers to cases when the
$x_\nu$ are not assumed to be independent have been given i.a. by Khintchine [44],
Lévy [62], [63] and Loève [67].

11. The central limit theorem and allied theorems. It was already known
to De Moivre that, in the case (10.1) of the Bernoulli distribution, the d.f. of
the normalized sum

$$\frac{x_1 + \cdots + x_n - np}{\sqrt{npq}}$$

tends, as $n \to \infty$, to the normal d.f. $\Phi(x)$. Considerably more general results
in this direction were stated by Laplace. After a long series of more or less
successful attempts, a rigorous proof of the main statements of Laplace was
given in 1901 by Liapounoff, [65]. More general cases were later considered i.a.
by Lindeberg [66], Lévy [61], [63], Khintchine [43] and Feller, [25]. The following
final form of the Central Limit Theorem is due to Feller.
Consider the expression

$$u_n = \frac{z_n - b_n}{a_n} = \frac{x_1 + \cdots + x_n - b_n}{a_n}, \tag{11.1}$$

where the $x_\nu$ are independent variables. We shall say that the $x_\nu$ obey the
central limit law, if the sequences $\{a_n\}$ and $\{b_n\}$ can be found such that the
d.f. of $u_n$ tends to $\Phi(x)$ as $n \to \infty$. In order to avoid unnecessary
complications, we shall restrict ourselves to sequences $\{a_n\}$ such that

$$a_\nu \to +\infty, \qquad \frac{a_{\nu+1}}{a_\nu} \to 1,$$

and we shall assume that the conditions (10.3) are satisfied. Then Feller's
theorem runs as follows:

The independent variables $x_1, x_2, \ldots$ obey the central limit law if, and only if,
there exists a sequence $q_n \to \infty$ such that simultaneously

$$\sum_{\nu=1}^{n} \int_{|x| > q_n} dF_\nu(x) \to 0, \qquad \frac{1}{q_n^2} \sum_{\nu=1}^{n} \int_{|x| < q_n} x^2\,dF_\nu(x) \to \infty. \tag{11.2}$$

When these conditions are satisfied, explicit expressions for the $a_n$ and $b_n$ can be
obtained.

Feller's theorem gives a complete solution of the problem. However, we
might still try to express in a more direct way the condition that the $q_n$ should
exist. We may also ask what happens when the conditions (11.2) are not
satisfied. Some particular cases of the latter question will be considered below.
However, very few general results are known in this direction.

The central limit theorem has been extended in various directions. Bernstein
[3], Lévy [62], [63], Loève [67] and others have considered cases where the
$x_\nu$ are not assumed to be independent. Important results have been reached,
but still much remains to be done.

On the other hand, several authors have considered symmetrical functions,
other than sums, of $n$ independent random variables. The problem of investigating
the asymptotic behaviour of the distributions of such functions, as $n$
tends to infinity, is of great importance in the theory of statistical sampling
distributions. It is known (cf. e.g. Cramér, [15]) that under certain general
regularity conditions there exists a normal limiting distribution. However, it
is also known that it is possible to give examples of particular functions (such
as e.g. the function which is equal to the largest of the $n$ variables), where there
exist limiting distributions which are non-normal. The conditions under
which this phenomenon may occur seem to deserve further study.

A further problem belonging to the same order of ideas is to find a closer
asymptotic representation of the d.f. of the standardized sum than that provided
by the normal function. Consider e.g. the simple case when the $x_\nu$
are independent variables all having the same d.f. $F(x)$ with a finite mean $m$, a
finite variance $\sigma^2$, and finite moments up to a certain order $k \geq 3$. Let
$G_n(x)$ be the d.f. of the variable

$$\frac{x_1 + \cdots + x_n - nm}{\sigma\sqrt{n}}.$$

It then follows from a theorem of Cramér [5], [9] that, as soon as the d.f. $F(x)$
contains an absolutely continuous component, there is an asymptotic expansion

$$G_n(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \frac{p_\nu(x)}{n^{\nu/2}}\,e^{-x^2/2} + O\left(n^{-(k-2)/2}\right), \tag{11.3}$$

where the $p_\nu(x)$ are polynomials in $x$ with coefficients depending on the moments
of $F$, and the constant implied by the $O$ is independent of $n$ and $x$. Cramér has
also given similar expansions in more general cases, and his results have been
further extended by P. L. Hsu [39], who deduces analogous expansions also for
other functions than sums. The most general conditions under which expansions
of this type exist are still unknown.

It follows from (11.3) that the difference $G_n(x) - \Phi(x)$ is, for any fixed $x$,
of the order $n^{-1/2}$ as $n \to \infty$. It is often important to know the asymptotic
behaviour of $G_n(x)$ when $n$ and $x$ increase simultaneously, and in that case (11.3)
yields only a trivial result. This case has been investigated by Cramér [10]
and Feller [29], and the results so far obtained permit important applications to
the so called law of the iterated logarithm (cf. below). However, it seems likely
that similar results may be obtained in considerably more general cases than those
hitherto investigated.
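The order $n^{-1/2}$ of the difference $G_n(x) - \Phi(x)$ can be observed in a case where $G_n$ is known in closed form. The sketch below assumes that the $x_\nu$ are exponentially distributed with mean 1 (so $m = \sigma = 1$, the d.f. is absolutely continuous and all moments are finite); the sum of $n$ such variables then has an explicit d.f. This choice of distribution, the evaluation point $x = 0$ and the function names are assumptions made for illustration.

```python
import math

def normal_df(x):
    """The normal d.f. Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sum_df(n, y):
    """d.f. at y of the sum of n independent exponential variables with mean 1."""
    if y <= 0:
        return 0.0
    term = math.exp(-y)          # j = 0 term of exp(-y) * sum_{j<n} y**j / j!
    upper = term
    for j in range(1, n):
        term *= y / j
        upper += term
    return 1.0 - upper

def G(n, x):
    """d.f. of (x_1 + ... + x_n - n*m) / (sigma * sqrt(n)) with m = sigma = 1."""
    return sum_df(n, n + x * math.sqrt(n))

for n in (4, 16, 64, 256):
    diff = G(n, 0.0) - normal_df(0.0)
    # If the difference is of order n**(-1/2), the last column stays nearly constant.
    print(n, diff, diff * math.sqrt(n))
```

The rescaled difference settles near a constant, which is what a leading correction term of order $n^{-1/2}$ in (11.3) predicts.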

A further interesting type of problems belonging to this order of ideas may
be approached in the following way. Consider the variables (11.1) in the
particular case when $x_1, x_2, \ldots$ are independent variables all having the same
d.f. $F(x)$. When the $a_n$ and $b_n$ can be found such that the d.f. of the normalized
sum $u_n$ tends to $\Phi(x)$, we shall say that $F$ belongs to the domain of attraction of
the normal law. Feller's theorem gives a NS condition that this should be so.
Now when this condition is not satisfied, it may still occur that the $a_n$ and $b_n$
can be so chosen that the d.f. of $u_n$ tends to a limiting d.f. $\Psi(x)$, which is
necessarily different from $\Phi(x)$. Then it is easily seen that $\Psi(x)$ must be a stable
distribution, with its c.f. defined by (9.1), and it is natural to say that $F$ belongs
to the domain of attraction of $\Psi$. Necessary and sufficient conditions that this
should hold have been given by Doeblin [16] and Gnedenko [34]. When the $a_n$ and
$b_n$ cannot be found such that the d.f. of the normalized sum $u_n$ converges to a limit,
it may still be possible to obtain a limiting d.f. by considering only a partial
sequence $u_{n_1}, u_{n_2}, \ldots$. Khintchine [47] has proved the interesting theorem
that the totality of limiting d.f.'s that may be obtained in this way coincides
with the class of infinitely divisible d.f.'s defined by (9.2). There are also
further results in the same direction given by Bawly [2], Khintchine [44], Lévy,
[61]-[63], and Gnedenko, [35].

12. The law of the iterated logarithm. Consider a sequence of independent
variables $x_1, x_2, \ldots$, such that the mean $E x_n = 0$ for all $n$, while the variances
$E x_n^2 = \sigma_n^2$ are finite. Put $s_n^2 = \sigma_1^2 + \cdots + \sigma_n^2$, and suppose
that the variables obey the central limit law with $a_n = s_n$, $b_n = 0$. (In particular
this will be the case when all $x_n$ have the same distribution.) For any function
$\psi(n)$ tending to infinity with $n$ we then have

$$\lim_{n \to \infty} P(|z_n| > s_n \psi(n)) = 0. \tag{12.1}$$

On the other hand, if $\psi(n)$ tends to a finite limit > 0, the same probability
has a positive limit.

It seems natural to consider the relation within the brackets in (12.1) not
only for a single large value of $n$, but to ask for the probability that this relation
holds simultaneously for an infinite number of values of $n$. The development
of this problem has led to the so called law of the iterated logarithm.

We shall in this respect use the following terminology due to Lévy. A
non-decreasing positive function $\psi(n)$ will be said to belong to the lower class with
respect to the variables $x_\nu$, if, with a probability equal to one, there are infinitely
many $n$ such that

$$|z_n| > s_n \psi(n).$$

On the other hand, $\psi(n)$ will be said to belong to the upper class if the
probability of the same property is equal to zero.

Every $\psi(n)$ belongs to one of these two classes. This is a special case of the
so called null-or-one law: if $S$ is a Borel set in the space of the independent random
variables $x_1, x_2, \ldots$, such that any two points differing at most in a finite number
of coordinates either both belong to $S$ or both belong to the complementary
set, then $P(S)$ can only assume the values 0 or 1.

It was proved by Kolmogoroff [51] that, subject to certain restrictions, the
function

$$\psi(n) = \sqrt{c \log \log s_n}$$

belongs to the lower class for any $c < 2$, and to the upper class for any $c > 2$,
which may be expressed by the relation

$$P\left(\limsup_{n \to \infty} \frac{|z_n|}{s_n \sqrt{2 \log \log s_n}} = 1\right) = 1. \tag{12.1}$$

More general results were proved by Feller [30], who proved i.a. that, subject to
certain restrictions, $\psi(n)$ belongs to the lower or upper class according as

$$\sum_{n} \frac{\sigma_n^2}{s_n^2}\,\psi(n)\,e^{-\frac{1}{2}\psi^2(n)} \tag{12.2}$$

is divergent or convergent (in certain special cases, this had been previously
found by Kolmogoroff and Erdős [24]). Feller also proved a more complicated
result, which contains the above as a particular case, and from which
it follows that the simple criterion (12.2) no longer holds when the restrictions
imposed in its proof are removed.


13. Convergence of series. For any sequence of random variables $x_n$, the
probability

$$P\left(\sum_{n} x_n \text{ converges}\right)$$

has a uniquely determined value. When the $x_n$ are independent, it follows from
the null-or-one law that this probability is either 0 or 1. By a theorem of
Khintchine and Kolmogoroff [48], the value 1 is assumed when and only when
the three series

$$\sum_{n} \int_{|x| > 1} dF_n(x), \qquad \sum_{n} E y_n, \qquad \sum_{n} E(y_n - E y_n)^2$$

are convergent, where

$$y_n = \begin{cases} x_n & \text{when } |x_n| \leq 1, \\ 0 & \text{when } |x_n| > 1. \end{cases}$$
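A standard illustration of the three-series criterion, given here as an assumed example that is not taken from the text, is the series with random signs $x_n = \pm c_n$, each sign having probability 1/2, with $0 < c_n \leq 1$. Then $y_n = x_n$, the first two series vanish term by term, and everything depends on the series of variances $\sum c_n^2$. The sketch below compares the partial sums for $c_n = 1/n$ and $c_n = n^{-1/2}$; the function name is hypothetical.

```python
def variance_series(scale, n_terms):
    """Partial sum of the third series for x_n = +scale(n) or -scale(n), probability 1/2 each.

    Assumes 0 < scale(n) <= 1, so that y_n = x_n, every P(|x_n| > 1) is zero and
    every E y_n is zero; only the series of variances scale(n)**2 is non-trivial."""
    return sum(scale(n) ** 2 for n in range(1, n_terms + 1))

sizes = (10**2, 10**3, 10**5)
harmonic = [variance_series(lambda n: 1.0 / n, N) for N in sizes]
root = [variance_series(lambda n: n ** -0.5, N) for N in sizes]

print(harmonic)  # partial sums that stay bounded
print(root)      # partial sums that keep growing, roughly like log N
```

By the criterion, the random-sign series with $c_n = 1/n$ converges with probability 1, while the one with $c_n = n^{-1/2}$ converges with probability 0.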

For the case when the $x_n$ are not assumed to be independent, various results
have been given by Lévy [63] and others, but our knowledge of the properties
of these series is still not very advanced.

14. Generalizations. In several instances it has been pointed out above
that the results concerning sums of independent variables may, to a certain
extent, be extended to cases when the variables are not independent. Generally
the independence condition has then to be replaced by some condition restricting
the degree of dependence. Results of this type were first given by Bernstein
[3], and then in more general cases by Lévy [62], [63], and Loève [67]. However,
this field has so far only been very incompletely explored.

Similar remarks apply to the generalization of the various theorems quoted
above to cases of variables and distributions in more than one dimension.

III. STOCHASTIC PROCESSES 

15. The theory of random variables in a finite number of dimensions is able
to deal adequately with practically all problems considered in classical
probability theory. However, during the early years of the present century, there
appeared in the applications various problems, where it proved necessary to
consider probability relations bearing on infinite sequences of numbers, or even
on functions of a continuous variable.

The mathematical set-up required for the study of such problems involves
the introduction of probability distributions in spaces of random sequences or
random functions (cf. 5 above). Generally, any process in nature which can be
analyzed in terms of probability distributions in spaces of these types will be
called a stochastic process. It is convenient to apply this name also to the
probability distribution used for the study of the process. We shall thus say, e.g.,
that a certain random function $x(t)$ is attached to the stochastic process which
is defined by the probability distribution of $x(t)$. In the majority of applications,
the variable $t$ will represent the time, and we shall often use a terminology
directly referring to this case. However, there are also other types of problems
in the applications ($t$ may e.g. be a spatial variable in an arbitrary number of
dimensions), and it is obvious that the purely mathematical problems connected
with these classes of probability distributions will have to be considered quite
independently of any concrete interpretation of the variable $t$ or the function $x(t)$.

A well-known example of this type of problems is afforded by the Brownian
movement. Let $x(t)$ be the abscissa at the time $t$ of a small particle immersed
in a liquid, and subject to molecular impacts. In every instant, the quantity
$x(t)$ receives a random impulse, and the problem arises to study the behaviour
of $x(t)$. According as we are content to consider $x(t)$ for a discrete sequence
of points, say for $t = 0, 1, 2, \ldots$, or we wish to consider all positive values of $t$,
we shall then have to introduce a probability distribution in the space of the
random sequence $x(0), x(1), \ldots$, or in the space of the random function $x(t)$,
where $t > 0$. We may then discuss such questions as the distribution of $x(t)$
for a given value of $t$, the joint and conditional distributions of $x(t)$ for two or
more values of $t$, and, in the case of a continuous variable $t$, continuity,
differentiability and other similar properties of the random function $x(t)$.

Wiener [82], [83] (cf. also Paley and Wiener [74]) was the first to give a rigorous
treatment of this process. He proved in 1923 that it is possible to define a
probability distribution in a suitably restricted functional space, such that the
increment $\Delta x(t) = x(t + \Delta t) - x(t)$ is independent of $x(t)$ for any $\Delta t > 0$.
With a probability equal to 1, the function $x(t)$ is continuous for all $t > 0$, and
for any fixed $t > 0$, the random variable $x(t)$ is normally distributed.
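A discretized version of such a process is easy to simulate by adding independent normal increments. The sketch below is only an approximation on a finite grid, under the assumed normalization $x(0) = 0$ and increment variance equal to $\Delta t$ (so that $x(1)$ has mean 0 and variance 1); the step count, sample size and seed are arbitrary choices for illustration.

```python
import random

def end_point(n_steps, t_end, rng):
    """Value x(t_end) of a path started at x(0) = 0 and built from independent
    normal increments with mean 0 and variance t_end / n_steps."""
    dt = t_end / n_steps
    x = 0.0
    for _ in range(n_steps):
        x += rng.gauss(0.0, dt ** 0.5)
    return x

rng = random.Random(12345)   # fixed seed so that the run is reproducible
samples = [end_point(50, 1.0, rng) for _ in range(4000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
# Under the assumed normalization, mean should be near 0 and var near 1.
print(mean, var)
```

Because each increment is generated independently of the path so far, the construction mirrors the independence of $\Delta x(t)$ and $x(t)$ stated above, although it says nothing about continuity of the limiting paths.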

Another example of stochastic processes studied at this stage occurs in the
theory of risk of an insurance company. Let $x(t)$ denote the total amount
of claims up to the time $t$ in a certain insurance company. As in the case of
the Brownian movement, it may seem natural to assume that the increment
$\Delta x(t)$ is independent of $x(t)$. On the other hand, $x(t)$ is in this case an
essentially discontinuous function, which is never decreasing, and increases only by
jumps of varying magnitudes occurring for certain discrete values of $t$, which are
not a priori known. Processes of this type were studied by F. Lundberg [69],
[70], H. Cramér [6] and others.

Further examples of particular processes were discussed in connection with
various applications, but no general theory of the subject existed until 1931,
when Kolmogoroff published a basic paper [53] dealing with the class of stochastic
processes which will here be denoted as Markoff processes (Kolmogoroff uses the
term “stochastically definite processes”), of which the two examples mentioned
above form particular cases. The theory of this class of processes was further
developed by Feller [26], [28]. In 1934, Khintchine [42] introduced another
important class of processes known as stationary processes. From 1937, the
general theory of the subject was subjected to a penetrating analysis in a series
of important works by Doob [18]-[22].²

16. Probability distributions in functional spaces. We have seen in 6
above how a probability distribution in the space of all functions $x(t)$ may be
defined, when $t$ varies in an arbitrary space $T$. Generally, we shall here content
ourselves with considering the cases when $T$ is the set of all real numbers, or the
set of all non-negative real numbers. Most results obtained for these cases
will be readily generalized to cases when $t$ varies in a Euclidean space of a finite
number of dimensions. On the other hand, when $T$ is enumerable, say consisting
of the points $t = 0, \pm 1, \pm 2, \ldots$, so that we are concerned with a random
sequence $x(0), x(\pm 1), \ldots$, the results for the continuous case will generally
hold and assume a simpler form which will not be particularly stated here.
The case when $T$ is a space of an infinite number of dimensions does not seem
to have been considered so far.

² A further interesting paper by Doob has appeared while the present paper was being
printed: “Probability in function space”, Bull. Amer. Math. Soc., Vol. 53 (1947), pp. 15-30.

In the present paragraph, it will be convenient to assume the function $x(t)$
to be real-valued, but the generalization to a complex-valued $x(t)$ requires
only obvious modifications. In the sequel we shall sometimes consider the
real-valued and sometimes the complex-valued case, according as the occasion
requires.

Let now $X$ be the space of all real-valued functions $x(t)$ of the real variable
$t$, where $-\infty < t < \infty$. According to 5, a probability measure $P(S)$ is uniquely
defined for all Borel sets $S$ in $X$ by means of the family of joint distributions
of all finite sequences $x(t_1), \ldots, x(t_n)$. In fact, $P(S)$ can be defined for a more
general class of sets than the Borel sets. For any set $S$ in $X$, we may define
an outer $P$-measure $\overline{P}(S)$ as the lower bound of $P(Z)$ for all sums $Z$ of finite or
enumerable sequences of intervals, such that $S \subset Z$. Further, the inner
$P$-measure $\underline{P}(S)$ is defined by the relation
$\underline{P}(S) = 1 - \overline{P}(X - S)$. When the outer and inner measures are equal,
$S$ is called $P$-measurable, and $P(S)$ is defined as their common value. Any
$P$-measurable set differs from a Borel set by a set of $P$-measure zero.

In many cases, this definition will be sufficient for an adequate treatment
of the problems that we wish to consider. However, in other cases we encounter
certain characteristic difficulties, which make it desirable to consider the
possibility of amending the basic definition. Thus it often occurs that we are
interested in the probability that the random function $x(t)$ satisfies certain
regularity conditions in a non-enumerable set of points $t$. We may, e.g., wish
to consider the probability that $x(t)$ is continuous for all $t$, that $x(t)$ should
be Lebesgue-measurable for all $t$, that $x(t) \leq k$ for all $t$, etc. Let $S$ denote the
set of all functions satisfying a condition of this type. It can then be shown
that the inner measure $\underline{P}(S)$ is always equal to zero so that $S$ is never
measurable, except in the (usually trivial) case when $\overline{P}(S) = 0$.

Consequently many interesting probabilities are left undetermined by the
general definition of a probability distribution in $X$ given above. The
possibility of modifying the definition so as to enable us to study probabilities of
this type has been thoroughly investigated by Doob [18]. He considers a
subspace $X_0$ of the general functional space $X$, where $X_0$ is chosen so as to
contain only, or almost only, “desirable” functions, i.e. functions satisfying
such regularity conditions as seem natural with respect to the problem under
investigation. We start from a given probability measure $P(S)$ in $X$, and ask
if it is possible to define a probability measure in the restricted space $X_0$, which
corresponds in some natural way to the given distribution in $X$. Let $S_0$ be
a set in $X_0$, and suppose that it is possible to find a $P$-measurable set $S$ in $X$
such that $S X_0 = S_0$. According to Doob, a probability measure $P_0$ in $X_0$
is then uniquely defined by the relation

$$P_0(S_0) = P(S)$$

if and only if the condition

$$\overline{P}(X_0) = 1$$

is satisfied.

The problem is thus reduced to finding a subspace $X_0$ of outer $P$-measure 1,
such that $X_0$ contains only functions of sufficiently regular behaviour. When
this can be done, we can restrict ourselves to consider only functions $x(t)$
belonging to $X_0$, the probability distribution in this space being defined by the
measure $P_0$. We shall then say that $x(t)$ is a random function, attached to a
stochastic process with the restricted space $X_0$. Doob has obtained a great
number of interesting results in this connection, e.g. with respect to the problem
of choosing $X_0$ such that it contains almost only Lebesgue-measurable functions,
or such that the probability of the relation $x(t) \leq k$ has a well-defined value for
all $k$. In particular he has shown that the last problem can be solved for
any given $P$-measure. However, our knowledge of the various possibilities
which exist with respect to the choice of $X_0$ is still very incomplete, and it seems
likely that further important results may be reached along this line of research.

An alternative method of introducing probability distributions in functional
spaces has been used by Wiener [82], [83] (cf. also Paley and Wiener, [74]).
Consider a given probability measure $\Pi$ in an arbitrary space $\Omega$, defined for all
sets $\Sigma$ of an additive class $C$. Let $x(t, \omega)$ denote a function (real- or
complex-valued, as the case may be) of the arguments $t$ (real) and $\omega$ (point in
$\Omega$), such that $x(t, \omega)$ for every fixed $t$ becomes a $C$-measurable function of
$\omega$. On the other hand, when $\omega$ is fixed, $x(t, \omega) = x(t)$ reduces to a function of
the real variable $t$. Let $X_0$ denote the set of all functions $x(t)$ corresponding in
this way to points of $\Omega$. Further, let $S_0 = S X_0$, where $S$ is a Borel set in $X$, and
let $\Sigma$ denote the set of all points $\omega$ such that $x(t, \omega) \in S_0$. Then $\Sigma$
belongs to $C$, and a probability measure $P_0$ in the functional space $X_0$ is uniquely
defined by the relation

$$P_0(S_0) = \Pi(\Sigma). \tag{16.1}$$

The relations between the two modes of definition have been discussed by
Doob and Ambrose [23], who have shown that they are largely equivalent.
However, it seems likely that in particular problems the one or the other
procedure may sometimes be the more advantageous, and further investigations
on this subject seem desirable.

17. Processes with a finite mean square. Consider a stochastic process
defined by a probability measure $P(S)$ in the space $X$ of all complex-valued
functions $x(t)$ of the real variable $t$. For any fixed $t_0$, the random variable
$x(t_0)$ is then a complex-valued function of the variable point $x(t)$ in the space
$X$, i.e. a point in the space $\Omega$ of all complex-valued functions defined on $X$.
When $t_0$ varies, this point describes a “curve” in $\Omega$, which then corresponds
to our stochastic process.

Suppose, in particular, that the mean square

$$E|x(t)|^2 = \int_X |x(t)|^2\,dP$$

is finite for any fixed value of $t$. This implies that for fixed $t$ the function
$x(t)$ belongs to $L_2$ over $X$, relative to the probability measure $P$. The random
variable $x(t)$ may then be regarded as an element of the Hilbert space $H$ of all
complex-valued functions $f$ belonging to $L_2$ over $X$, the inner product $(f, g)$ of
two elements $f$ and $g$ being defined by the relation

$$(f, g) = \int_X f \bar{g}\,dP = E(f \bar{g}).$$

The stochastic process to which $x(t)$ is attached then corresponds to a “curve”
in $H$ (Kolmogoroff, [56], [57]), so that the well-known theory of Hilbert space is
available for the study of the process. In particular, convergence in the usual
metric of Hilbert space is equivalent to convergence in the mean of order 2 for
random variables.

Let $H_x$ be the smallest closed linear subspace of $H$ which contains all elements
of the form $a_1 x(t_1) + \cdots + a_n x(t_n)$. If the covariance function

$$r(t, u) = (x(t), x(u)) = E\left(x(t)\,\overline{x(u)}\right)$$

is continuous for all real values of $t$ and $u$, then $x(t) \to x(t_0)$ in the mean, as
$t \to t_0$, and we shall say that the process $x(t)$ is continuous. For any continuous
process, $H_x$ is separable. When $g(t)$ is a continuous non-random function of $t$,
and $x(t)$ is attached to a continuous stochastic process, the Riemann-Darboux
sums formally associated with the integral

$$\int g(t)\,x(t)\,dt$$

are easily shown to tend to a limit $y$, which is an element of $H_x$, i.e. a random
variable. By definition, we may identify the integral with this variable $y$,
and this integral will possess the essential properties of the ordinary Riemann
integral (Cramér, [12]).

The application of the theory of Hilbert space to stochastic processes seems
to open very interesting possibilities. Some applications to particular classes
of stochastic processes will be mentioned below. Further important results
belonging to this order of ideas will be given in a work by K. Karhunen [40], which
is in course of publication.

18. Relations to ergodic theory. There is a close connection between the
theory of stochastic processes and ergodic theory. In ergodic theory, as
summarized e.g. in the treatise of E. Hopf [38], we consider an arbitrary space $\Omega$,
and a probability measure $\Pi$, defined for all sets $\Sigma$ belonging to the additive
class C. We further consider a one-parameter group of one-one transformations
of $\Omega$ into itself (a “flow” in $\Omega$) such that the transformation corresponding to
the parameter value t takes the point $\omega = \omega_0$ into $\omega_t$, while $(\omega_t)_u = \omega_{t+u}$. Let
$f(\omega)$ be a given function, defined throughout $\Omega$, and such that $f(\omega_t)$ is
C-measurable for every fixed t. The well-known ergodic theorems due to von Neumann,
Birkhoff, Khintchine and others are then concerned with the asymptotic
behaviour of mean values, which in the classical cases are of the types

$$\frac{f(\omega_0) + f(\omega_1) + \cdots + f(\omega_{n-1})}{n}$$

or

$$\frac{1}{T}\int_0^T f(\omega_t)\,dt,$$

as n or T tends to infinity. (In the case of the latter expression, it is necessary
to introduce some additional condition implying measurability in t.)
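As a purely illustrative instance of the first type of mean value, the sketch below takes $\Omega$ to be the unit interval with Lebesgue measure, the discrete flow $\omega_k = \omega_0 + k\alpha \pmod 1$ with an irrational $\alpha$, and $f(\omega) = \cos 2\pi\omega$. These choices, and the name `time_average`, are assumptions made for the example and do not come from the text; for such a rotation the time average should approach the space average of f, which is zero.

```python
import math

def time_average(f, omega0, alpha, n):
    """Mean of f along the orbit omega_k = omega0 + k*alpha (mod 1), k = 0..n-1."""
    total = 0.0
    for k in range(n):
        omega_k = (omega0 + k * alpha) % 1.0
        total += f(omega_k)
    return total / n

def f(omega):
    # Assumed test function; its integral over [0, 1) is zero.
    return math.cos(2.0 * math.pi * omega)

alpha = math.sqrt(2.0) - 1.0   # irrational rotation angle (assumed for the example)
averages = [time_average(f, 0.3, alpha, n) for n in (10, 100, 1000, 10000)]
```

Printing `averages` shows the time means shrinking toward zero as n grows, which is the behaviour the ergodic theorems describe for this particular flow.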

Writing $x(t, \omega) = f(\omega_t)$, it is seen that to a given transformation group $\omega \to \omega_t$
and a given function $f(\omega)$, there corresponds a stochastic process in the sense of
Wiener’s definition (cf. 16). The space $X_0$ of this process consists of all functions
x(t) representable in the form $x(t) = f(\omega_t)$, when $\omega = \omega_0$ varies over $\Omega$. The
corresponding probability measure $P_0$ is defined by (16.1).

Thus any of the above-mentioned ergodic theorems may be expressed as a
theorem concerning “temporal” mean values of the types

$$\frac{x(0) + x(1) + \cdots + x(n-1)}{n}$$

or

$$\frac{1}{T}\int_0^T x(t)\,dt.$$

If, according to some reasonable convergence definition, we may assign a limit
to either of these expressions, as n or T tends to infinity, this limit will be a
random variable, and it is important to find conditions which imply that this
variable has a constant value for “almost all” functions x(t), i.e. for all x(t)
except at most a set of $P_0$-measure zero.

In the particular case when x(0), x(1), ... are independent variables all
having the same distribution, the classical ergodic theorems yield simple cases
of the laws of large numbers (cf. 10). The mean ergodic theorem of von Neumann
gives the weak law, while the Birkhoff-Khintchine theorem gives the
strong law. Some more general results belonging to this order of ideas will be
mentioned in the sequel.

It will be seen that the two theories are largely equivalent, and it seems 
likely that further comparative studies of the methods will be of great value to 
both sides. 

19. Markoff processes. Consider now a stochastic process, defined by a
probability measure P(S) in the space X of all real-valued functions x(t) of the
real variable t. For any $t_1 < t_2$, there is a certain conditional probability
$P(x(t_2) \subset S \mid x(t_1) = a_1)$ of the relation $x(t_2) \subset S$, relative to the hypothesis
that $x(t_1)$ assumes the given value $a_1$. Suppose now that this conditional
probability is independent of any additional hypothesis concerning the behaviour of
x(t) for $t < t_1$, so that we have e.g. for any $t_0 < t_1 < t_2$ and for any $a_0$

$$P(x(t_2) \subset S \mid x(t_1) = a_1) = P(x(t_2) \subset S \mid x(t_1) = a_1,\ x(t_0) = a_0).$$

In this case the process is called a Markoff process.
The general theory of this type of processes, which forms a natural
generalization of the classical concept of Markoff chains, has been studied in basic
works by Kolmogoroff [53] and Feller [26], [28]. Writing

$$P(x(t) \le \xi \mid x(t_0) = a_0) = F(\xi;\ t, a_0, t_0),$$

where $t_0 < t$, F will be the distribution function of the random variable x(t),
relative to the hypothesis $x(t_0) = a_0$. Then F satisfies the Chapman-Kolmogoroff
equation

$$(19.1)\qquad F(\xi;\ t, a_0, t_0) = \int_{-\infty}^{\infty} F(\xi;\ t, \eta, t_1)\,d_\eta F(\eta;\ t_1, a_0, t_0),$$

which expresses that, starting from the state $x(t_0) = a_0$, the state $x(t) \le \xi$
must be reached by passing through some intermediate state $x(t_1) = \eta$, where
$t_0 < t_1 < t$. Subject to certain general conditions, it is possible to show that
any solution of this equation satisfies certain integro-differential equations,
which in some important cases reduce to partial differential equations of
parabolic type, and that the distribution function F is uniquely determined by these
equations. However, the general conditions mentioned above are in many cases difficult
to apply to particular classes of processes, and it would be important to have further
investigations concerning these questions.

Markoff processes (not belonging to the subclass of differential processes,
which will be considered in the following paragraph) appear in several important
applications, e.g. in the theory of cosmic radiation, in certain genetical problems,
in the theory of insurance risk etc. In these cases, we are often concerned with
the class of purely discontinuous Markoff processes, where the function x(t)
only changes its value by jumps. If, in addition, there are only a finite or
enumerable set of possible values for x(t), the Chapman-Kolmogoroff equation
(19.1) reduces to

$$(19.2)\qquad \pi_{ik}(t_0, t) = \sum_j \pi_{ij}(t_0, t_1)\,\pi_{jk}(t_1, t),$$

where $\pi_{ik}(t_0, t)$ denotes the “transition probability”, i.e. the probability that
x(t) will be in the kth state at the time t, when it is known to have been in the
ith state at the time $t_0$. In matrix form, this equation may be written

$$(19.3)\qquad \Pi(t_0, t) = \Pi(t_0, t_1)\,\Pi(t_1, t),$$

where $\Pi$ denotes the matrix of the $\pi_{ik}$.

When only a sequence of discrete values of t are considered, we have here
the classical case of Markoff chains, which has received a detailed treatment
in the well-known book by Fréchet [32] (cf. also Doob, [19]). The case when t
is a continuous variable has been treated by Feller [28], O. Lundberg [71],
Arley [1], and other authors. Some of the most important problems of this
branch of the subject are concerned with the existence of a unique system of
solutions of (19.2) or (19.3), and with the asymptotic behaviour of the
solutions for large values of $t - t_0$. Though important results have been reached,
there still remains much to be done here, and the same thing holds a fortiori
with respect to the analogous problems for general Markoff processes.
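The matrix relation (19.3) can be checked numerically in the simplest setting, a temporally homogeneous Markoff chain observed at integer times, where $\Pi(t_0, t)$ is a power of a one-step matrix. The sketch below is only an illustration under that assumption; the three-state matrix `P` and the helper names `mat_mul` and `transition` are invented for the example.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[j] * b[j][c] for j in range(inner)) for c in range(cols)]
            for row in a]

def transition(p, steps):
    """Transition matrix over `steps` unit time steps for a homogeneous chain."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result

# Hypothetical one-step transition probabilities; each row sums to 1.
P = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.3, 0.7]]

t0, t1, t = 0, 3, 8
lhs = transition(P, t - t0)                                   # Pi(t0, t)
rhs = mat_mul(transition(P, t1 - t0), transition(P, t - t1))  # Pi(t0, t1) Pi(t1, t)
max_gap = max(abs(lhs[i][k] - rhs[i][k]) for i in range(3) for k in range(3))
row_sums = [sum(row) for row in lhs]
```

Here `max_gap` measures how far the two sides of (19.3) differ in floating-point arithmetic, and `row_sums` confirms that each row of the composed matrix is still a probability distribution.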

20. Differential processes. A particularly interesting case of a Markoff
process arises when, for any $\Delta t > 0$, the increment $\Delta x(t) = x(t + \Delta t) - x(t)$
is independent of $x(\tau)$ for $\tau \le t$. The process is then called a differential process.
Some of the earliest studied stochastic processes belong to this class, which
contains in particular the two examples discussed above in 15. Further cases
of such processes arise e.g. in the theory of radioactive disintegration and in
telephone technique.

Let us suppose that x(0) is identically equal to zero, and that the process is
uniformly continuous in probability in every finite interval $0 \le t \le T$, i.e.
that for any fixed positive $\epsilon$

$$P(|x(t + \Delta t) - x(t)| > \epsilon) \to 0$$

as $\Delta t \to 0$, uniformly for $0 \le t \le T$. Then it follows from the works of Lévy
[60], [63], Khintchine [47] and Kolmogoroff [64] that, for any t > 0, the random
variable x(t) has an infinitely divisible distribution, with a characteristic
function $\varphi(z; t)$ given by (9.2), where $\beta$, $\gamma$, M(u) and N(u) may depend on t.

In the particularly important case when the distribution of the increment
$x(t + \Delta t) - x(t)$ does not involve t, but only depends on the length $\Delta t$ of the
interval, we say that the process is temporally homogeneous, and in this case
we have

$$\log \varphi(z; t) = t \log \varphi(z; 1),$$

so that we obtain the general formula for $\varphi(z; t)$ simply by replacing in (9.2)
$\beta$, $\gamma$, M(u) and N(u) by $t\beta$, $t\gamma$, tM(u) and tN(u) respectively.

When $t \to \infty$, or $t \to 0$, the appropriately normalized distribution of x(t)
tends, under certain conditions, to a stable distribution (Cramér [7],
Gnedenko [36]). When this limiting distribution is normal, there are sometimes
even asymptotic expansions analogous to (11.3). Still, the problem of the
asymptotic behaviour of the distribution for large t does not seem to be definitely
cleared up.

Khintchine [41] and Gnedenko [37] have given interesting generalizations
of the law of the iterated logarithm (cf. 12) to processes of the type considered
here.

The continuous process discussed in 15 in connection with the Brownian
movement corresponds to the temporally homogeneous case when $\beta$, M(u) and
N(u) all reduce to zero, so that

$$\varphi(z; t) = e^{-\gamma t z^2},$$

which shows that the distribution of x(t) is normal, with mean zero and
variance $2\gamma t$.

On the other hand, in the applications to the theory of insurance risk, $\gamma$ is
zero, while M(u) and N(u) are connected with the distribution of the various
magnitudes of claims. In this type of applications, it is often very important
to find the probability that x(t) satisfies an inequality of the form

$$x(t) < a + bt$$

for all values of t. It follows from the discussion in 16 that the definition of
a probability of this type is somewhat delicate. The problem, which can be
regarded as an extended form of the classical problem of “the gambler’s ruin,”
has been solved in certain particular cases. It leads to integral equations,
which in the simplest case are of the Volterra, in other cases of the
Wiener-Hopf type (Cramér [6], [13], Segerdahl [79], Täcklind [81]).

21. Orthogonal processes. Consider now the case of a complex-valued
x(t), and suppose that $E|x(t)|^2$ is finite for all t. Without restricting the
generality, we may assume that Ex(t) = 0 for all t.

Suppose now that instead of requiring, as in the case of a differential process,
that the variables $x(\tau)$ and $\Delta x(t)$ should be independent when $\tau \le t$, we only
lay down the less stringent condition that these variables should be
non-correlated, i.e. that

$$E\big(x(\tau)\overline{\Delta x(t)}\big) = 0.$$

We then obtain a process which is no longer necessarily of the Markoff type.
The condition implies that, for any two disjoint intervals $(t_1, t_2)$ and $(t_3, t_4)$,
we have

$$E\big[(x(t_2) - x(t_1))\overline{(x(t_4) - x(t_3))}\big] = 0,$$

so that the “chords” corresponding to two disjoint “arcs” of the curve in
Hilbert space representing the process are always orthogonal (Kolmogoroff
[56], [57]). A process of this type may accordingly be called an orthogonal
process.
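The difference between non-correlated and independent increments can be seen on a very small probability space. The sketch below is an illustration only: the two-step process built from two fair signs `s1`, `s2` is an assumption of the example, not something taken from the text. Its two increments are orthogonal, yet the second increment vanishes whenever the first equals -1, so they are not independent. Exact fractions are used so that the expectations carry no rounding error.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product((-1, 1), repeat=2))   # (s1, s2), each with probability 1/4
prob = Fraction(1, len(outcomes))

def x(outcome, t):
    """Hypothetical real-valued process at times t = 0, 1, 2."""
    s1, s2 = outcome
    return (0, s1, s1 + s2 * (1 + s1))[t]

def expect(g):
    """Exact expectation of g over the four equally likely outcomes."""
    return sum(prob * g(w) for w in outcomes)

mean_values = [expect(lambda w, t=t: x(w, t)) for t in range(3)]

# Chords over the disjoint intervals (0, 1) and (1, 2).
chord_product = expect(lambda w: (x(w, 1) - x(w, 0)) * (x(w, 2) - x(w, 1)))

# F(t) = E|x(t)|^2 should never decrease.
F = [expect(lambda w, t=t: x(w, t) ** 2) for t in range(3)]

# Dependence check: P(second increment = 0 and x(1) = -1) versus the product.
p_zero = expect(lambda w: 1 if x(w, 2) == x(w, 1) else 0)
p_low = expect(lambda w: 1 if x(w, 1) == -1 else 0)
p_joint = expect(lambda w: 1 if x(w, 1) == -1 and x(w, 2) == x(w, 1) else 0)
```

In this toy case `chord_product` is exactly zero while `p_joint` differs from `p_zero * p_low`, which is the situation the orthogonality condition allows and the independence condition excludes.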

For a process of this type we have, writing $E|x(t)|^2 = F(t)$,
$F(t + \Delta t) - F(t) = E|x(t + \Delta t) - x(t)|^2$, so that F(t) is a never decreasing function of t.
If F(t) is bounded for all t, we shall say that the orthogonal process is bounded.
For a bounded orthogonal process, the Stieltjes integral

$$\int_{-\infty}^{\infty} g(t)\,dx(t),$$

where g(t) is bounded and continuous, may be defined as the limit in the mean
of sums of the form

$$\sum_\nu g(t_\nu)\,[x(t_{\nu+1}) - x(t_\nu)].$$
22. Stationary processes. When we are concerned with a process representing
the temporal development of a system governed by laws which are invariant
under a translation in time, it seems natural to assume that the joint
distribution of any group of variables of the form

$$(22.1)\qquad x(t_1 + \tau),\ \cdots,\ x(t_n + \tau)$$

is independent of $\tau$. A process satisfying this condition will be called a
stationary process. If a stochastic process is defined by means of a “flow” $\omega \to \omega_t$
in a space $\Omega$ (cf. 18), the process will be stationary when and only when the
corresponding flow is measure-preserving, i.e. if the transformation $\omega \to \omega_t$
changes any C-measurable set S into a set $S_t$ of the same measure.
Under appropriate conditions with respect to the measurability of x(t), the
Birkhoff-Khintchine ergodic theorem holds for a stationary process, i.e. there
exists a random variable y such that we have

$$(22.2)\qquad P_0\left(\lim_{T \to \infty} \frac{1}{T}\int_0^T x(t)\,dt = y\right) = 1,$$

where $P_0$ is the probability measure in a suitably restricted space in the sense
of Doob. Further work seems to be required here, in order to make the
situation quite clear, also with regard to metric transitivity.
For a stationary process, any finite moment of the joint distribution of the
variables (22.1) is obviously independent of $\tau$. Suppose now that we only
require that this invariance under translations in time should hold for moments
of the first and second order of the joint distributions, which are assumed to
be finite. The wider class of processes obtained in this way may be called
stationary of the second order. Processes of this type have been studied for the
first time by Khintchine [42]. We shall assume that x(t) is complex-valued.
Without restricting the generality, we may further assume that Ex(t) = 0 for
all t. The product moment $E(x(t)\overline{x(u)})$ will then be a function of the difference
t − u:

$$(22.3)\qquad E\big(x(t)\overline{x(u)}\big) = R(t - u).$$

Assuming, in addition, that R(t) is continuous at t = 0, it follows that R(t)
is continuous for all t, and the process is continuous in the sense of 17. It was
shown by Khintchine that a necessary and sufficient condition that a given function
R(t) should be associated with a second order stationary and continuous process by
means of the relation (22.3) is that we should have

$$(22.4)\qquad R(t) = \int_{-\infty}^{\infty} e^{itx}\,dF(x)$$

for all t, where the spectral function F(x) is real, never decreasing and bounded.
In particular, we have

$$F(+\infty) - F(-\infty) = R(0) = E|x(t)|^2 = \sigma^2.$$

Khintchine’s condition for R(t) was generalized by Cramér to the case of an
arbitrary number of processes $x_1(t), \cdots, x_n(t)$, such that the product moments
$E(x_i(t)\overline{x_j(u)})$ are functions of the difference t − u. The corresponding spectral
functions $F_{ij}(x)$ are in general complex-valued and of bounded variation.
Further, the expression (Cramér, [12])

$$\sum_{i,j=1}^{n} z_i \bar{z}_j\,\Delta F_{ij},$$

where $\Delta F_{ij} = F_{ij}(b) - F_{ij}(a)$, is, for any a < b, a non-negative Hermite form in
the variables $z_i$. This result is closely connected with a theorem on Hilbert
space considered by Kolmogoroff and Julia. It is further shown that, to any
given functions $F_{ij}(x)$, (i, j = 1, ..., n), satisfying these conditions, we can
always find n processes $x_1(t), \cdots, x_n(t)$ such that the joint distribution of any set
of variables is always normal, while the covariance functions $R_{ij}(t - u) =
E(x_i(t)\overline{x_j(u)})$ are given by the expression

$$R_{ij}(t) = \int_{-\infty}^{\infty} e^{itx}\,dF_{ij}(x).$$

For a process x(t) which is continuous and stationary of the second order,
with Ex(t) = 0 for all t, we have the mean ergodic theorem

$$(22.5)\qquad \underset{T \to \infty}{\mathrm{l.i.m.}}\ \frac{1}{T}\int_0^T e^{-i\lambda t}x(t)\,dt = y$$

for any real $\lambda$. The random variable y has the mean 0 and the variance
$F(\lambda + 0) - F(\lambda - 0)$, where F is the spectral function appearing in (22.4). If $\lambda$ is a
point of continuity for F, it thus follows that y = 0 with a probability equal
to 1. On the other hand, if $\lambda$ is a discontinuity, y has a positive variance. Let
$\lambda_1, \lambda_2, \cdots$ be all the discontinuities of F(x), and let $\sigma_1^2, \sigma_2^2, \cdots$ be the
corresponding saltuses, while $y_1, y_2, \cdots$ are the limits in the mean obtained from
(22.5) for $\lambda = \lambda_1, \lambda_2, \cdots$. Then two different $y_\nu$ are always orthogonal:
$E(y_j\bar{y}_k) = 0$ for $j \ne k$, and we have

$$(22.6)\qquad x(t) = \sum_\nu y_\nu e^{i\lambda_\nu t} + \xi(t),$$

where $E\xi(t) = 0$ and

$$E|\xi(t)|^2 = \sigma^2 - \sum_\nu \sigma_\nu^2.$$

If F(x) is a step-function, we have $\sigma^2 = \sum_\nu \sigma_\nu^2$, and it follows that $\xi(t) = 0$
with a probability equal to 1, so that (22.6) gives a “stochastic Fourier
expansion” of x(t) (Slutsky, [80]).
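For a process with a purely discrete spectrum, the link between the expansion (22.6) and the covariance function can be verified by direct computation. The sketch below is an illustration under stated assumptions: the spectral points `lambdas`, the scale factors `sigmas`, and the choice of $y_\nu = \pm\sigma_\nu$ with independent fair signs are all invented for the example. With these choices the $y_\nu$ are orthogonal with $E|y_\nu|^2 = \sigma_\nu^2$, so the covariance should equal $\sum_\nu \sigma_\nu^2 e^{i\lambda_\nu (t-u)}$.

```python
import cmath
from itertools import product

lambdas = [0.5, 1.3, 2.0]     # assumed discontinuity points of the spectral function
sigmas = [1.0, 0.5, 0.25]     # assumed sigma_nu; the saltuses are sigma_nu ** 2

# All equally likely sign patterns for (y_1, y_2, y_3) = (s_1*sigma_1, ...).
outcomes = list(product((-1, 1), repeat=len(lambdas)))

def x(signs, t):
    """x(t) = sum of y_nu * exp(i * lambda_nu * t) for one sign pattern."""
    return sum(s * sig * cmath.exp(1j * lam * t)
               for s, sig, lam in zip(signs, sigmas, lambdas))

def covariance(t, u):
    """E[x(t) * conj(x(u))], averaged over the sign patterns."""
    return sum(x(w, t) * x(w, u).conjugate() for w in outcomes) / len(outcomes)

def R(tau):
    """Covariance function implied by a step spectral function."""
    return sum(sig ** 2 * cmath.exp(1j * lam * tau)
               for sig, lam in zip(sigmas, lambdas))

gap = abs(covariance(2.7, 1.2) - R(1.5))
total_variance = R(0.0).real
```

The quantity `gap` compares the directly averaged product moment with the spectral formula, and `total_variance` is the sum of the saltuses, matching $R(0) = \sigma^2$ for a step-function spectrum.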





Even when F(x) is arbitrary, we can obtain a spectral representation of x(t)
generalizing (22.6). In fact, it can be shown (Cramér, [14]) that x(t) can always
be represented by a Fourier-Stieltjes integral

$$(22.7)\qquad x(t) = \int_{-\infty}^{\infty} e^{itu}\,dz(u),$$

where z(u) is a random function attached to a bounded orthogonal process
(cf. 21), such that

$$E|z(u + \Delta u) - z(u)|^2 = F(u + \Delta u) - F(u).$$

Conversely, we have

$$(22.8)\qquad \Delta z(u) = \int_{-\infty}^{\infty} \frac{e^{-it(u+\Delta u)} - e^{-itu}}{-2\pi i t}\,x(t)\,dt,$$

so that there is a one-one correspondence between x(t) and $\Delta z(u)$. The integrals
(22.7) and (22.8) are defined as limits in the mean, as shown above in 17 and 21.
These results are in close correspondence with generalized harmonic analysis for
an arbitrary function, as developed by Wiener [83] and Bochner [4]. The
spectral representation of a stochastic process has important applications, some of
which will be considered in a forthcoming paper by Karhunen [40]. An
extension of the spectral representation to a more general class of processes has been
given by Loève [68].

When, in particular, the x(t) process is such that the joint distribution of any
group of variables $x(t_1), \cdots, x(t_n)$ is normal, it follows that any increment
$\Delta z(u)$ is normally distributed. Since two uncorrelated normally distributed
variables are always independent, it follows that in this case the z(u) process
is a differential process with normally distributed increments. Important
results for this case have recently been given by Doob [22].

The properties of continuity, differentiability etc. for processes of the type
here considered are still incompletely known, and further work is required.
A further group of important unsolved problems are connected with an
interesting decomposition theorem by Wold [84], which holds for processes with
a discrete time variable. The generalization of this theorem to the continuous
case does not seem to have so far been given in a final form.

THE ESTIMATION OF DISPERSION FROM DIFFERENCES¹

By Anthony P. Morse² and Frank E. Grubbs

Ballistic Research Laboratory, Aberdeen Proving Ground, Maryland

Summary. The estimation of variance by use of successive differences of
higher order is discussed in this paper. Heretofore, attention has been focused,
in published works, on estimates of variance obtained by employing the sum of
squares of deviations from the mean and also by using mean square successive
differences of the first order [1], [2], [3], [9]. A concise description of the method
employing differences of any order with appropriate formulae for the precision
of estimates so obtained and also a practical example on the use of the technique
are given in section 11. Fundamental contributions to the estimation of
variance from higher order differences, a study of the efficiency of the technique
and proper orientation of the subject matter in the field of mathematical
statistics are given in sections 2-10 of the paper.

1. Introduction. It frequently happens that successive observations, made
at regular intervals of time, are subject to the same standard error while the
means of the populations from which they are drawn display some kind of trend.
The type of trend we speak of is brought about because of the manner in which
we have to take measurements or because of variations in the measuring
technique itself, or, again, the trend may be characteristic of the thing we are
measuring. In any event, we may desire to eliminate the trend in order to study
residual effects. As an example, it is desirable in the field of ballistics to evaluate
the dispersion of machine guns firing from a moving airplane.
¹ This paper is based substantially on a Ballistic Research Laboratory Report [10]
of the same subject by Morse and has been prepared for publication by Grubbs at the
suggestion of R. H. Kent. The authors are grateful to J. V. Lewis and H. L. Meyer for their
many and varied comments, criticisms and suggestions.

² Now at the University of California, Berkeley, California.

It may also happen that it is either inexpedient or impossible to estimate the
standard error of the observations by the method of least squares, for in a large
number of cases the type of trend is unknown. In this event a method employing
differences of an appropriate order may prove valuable. The method consists
merely of arranging the data in a vertical column in the order in which the
observations were taken and then forming difference columns in the usual way of
order 1, 2, up to say 5 or some other number depending on the peculiarities of the
problem at hand and the number of the original observations. Next, sum the
squares of the numbers in each column and divide the sum of squares of the pth
order differences by $\binom{2p}{p}(n - p)$. When $n \ge 2$ and $p \ge 1$, the numbers thus
arrived at are all unbiased estimates of the population variance $\sigma^2$ for the case
where all the observations have the same expected value. In section 11 at the
end of the paper will be found a summary of this method, formulas by which
the precision of the estimate of the variance $\sigma^2$ may be determined, and an
example displaying the stability of this estimate with respect to p.
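The procedure just described is short enough to sketch directly. The code below is a minimal illustration, assuming the divisor $\binom{2p}{p}(n - p)$ that appears in the paper's definition of $d^2_{n,p}$; the function names and the sample `observations` are invented for the example and are not data from the paper.

```python
from math import comb

def backward_differences(xs, p):
    """Return the column of pth order differences of the sequence xs."""
    d = list(xs)
    for _ in range(p):
        d = [later - earlier for earlier, later in zip(d, d[1:])]
    return d

def variance_from_differences(xs, p):
    """Sum of squared pth differences divided by C(2p, p) * (n - p)."""
    n = len(xs)
    if not 1 <= p < n:
        raise ValueError("need 1 <= p < n")
    d = backward_differences(xs, p)
    return sum(v * v for v in d) / (comb(2 * p, p) * (n - p))

# Made-up observations with a steady upward trend plus small irregularities.
observations = [10.2, 11.9, 14.1, 15.8, 18.3, 19.9, 22.1, 24.0]
estimates = {p: variance_from_differences(observations, p) for p in (1, 2, 3)}
```

Comparing the entries of `estimates` across p gives a rough picture of how much of the first-difference estimate is due to trend: for an exactly quadratic sequence such as 1, 4, 9, 16, 25 the third-difference estimate is zero, since differencing three times removes the trend entirely.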

If a strong trend is present then the method of first differences will obviously
yield an estimate of variance which is fictitiously large and the temptation to
pass to higher order differences may quite reasonably be yielded to. As a matter
of fact, unbiased estimates may be hoped for from pth order differences whenever
there is good reason to suppose that the pth derivative of the trend function is
small most of the time. However, even in the case of a sinusoidal trend where
all derivatives have the same magnitude one may obtain good results from higher
differences provided there are at least seven observations in each interval of
length one period (see section 5 and Table II below). In connection with trends
such as the sinusoidal type, the hopelessness of getting, say, even a fifth degree
polynomial to fit over an interval of, say, 20 periods is rather evident. It is
for the above reasons that estimation of variance from higher order differences
deserves consideration.

2. Historical comment. A brief historical development of the interest in
successive differences as a means for estimating dispersion is given in [3]. This
paper discusses the statistic



suggested by “Student” [W. S. Gossett] and E. S. Pearson and points out the
relevant work of Jordan, Helmert, Vallier, Cranz, and Becker. It seems that
Jordan devised methods based on sums of powers of the differences, whereas
Helmert gave more careful consideration to the case of the first power, i.e. the
sum of absolute differences. Reference [3] points out, however, that in these
two cases all the n(n − 1)/2 differences that can be established from a sample of
n observations were included in the estimates of dispersion recommended by
Jordan and Helmert, so that the estimate was of no value in reducing the effect
of a trend. Continuing the remarks of [3], we learn that in ballistics Vallier
appears to have been the first to estimate dispersion from successive differences
and that Cranz and Becker commended the mean successive difference

$$\frac{\sum_{i=1}^{n-1} |x_{i+1} - x_i|}{n - 1}$$

in estimating dispersion in range of guns since they were aware of variable
external effects (such as tail winds) on a projectile. In this country, Bennett [1]
appears to have suggested the use of successive differences independently of
European ballisticians. In this connection, Bennett suggested that the probable
error should be estimated from the root mean square successive differences as
follows:

$$\text{P.E.} = .6745\sqrt{\frac{\sum_{i=1}^{n-1}(x_{i+1} - x_i)^2}{2(n - 1)}}.$$

TABLE I

The Efficiency, W(n, p), of $\delta^2_{n,p}$ As An Estimate of $\sigma^2$

   n     p=1      p=2      p=3      p=4      p=5      p=6      p=7      p=8      p=9      p=10
   2   1.00000
   3    .80000   .50000
   4    .75000   .46154   .33333
   5    .72727   .46552   .32000   .25000
   6    .71429   .47213   .33149   .24427   .20000
   7    .70588   .47771   .34453   .25510   .19672   .16667
   8    .70000   .48214   .35537   .26871   .20633   .16471   .14286
   9    .69565   .48568   .36408   .28071   .21888   .17274   .14159   .12500
  10    .69231   .48855   .37113   .29071   .23068   .18385   .14830   .12414   .11111
  11    .68966   .49091   .37691   .29904   .24070   .19476   .15802   .12978   .11050   .10000
  12    .68750   .49288   .38173   .30602   .24934   .20450   .16798   .13827   .11629   .09955
  13    .68571   .49455   .38580   .31194   .25672   .21300   .17714   .14729   .12271   .10366
  14    .68421   .49598   .38928   .31701   .26308   .22039   .18530   .15581   .13086   .11018
  15    .68293   .49722   .39228   .32139   .26859   .22684   .19250   .16353   .13874   .11754
  16    .68182   .49831   .39490   .32522   .27342   .23251   .19887   .17045   .14601   .12481
  17    .68085   .49926   .39721   .32859   .27767   .23752   .20452   .17664   .15260   .13162
  18    .68000   .50011   .39925   .33158   .28145   .24197   .20956   .18218   .15855   .13787
  19    .67925   .50087   .40107   .33424   .28482   .24595   .21407   .18716   .16393   .14366
  20    .67857   .50155   .40271   .33663   .28784   .24953   .21813   .19164   .16879   .14875
  21    .67797   .50216   .40419   .33880   .29058   .25276   .22181   .19571   .17321   .15347
  22    .67742   .50272   .40553   .34076   .29306   .25569   .22515   .19941   .17723   .15778
  23    .67692   .50323   .40676   .34254   .29532   .25837   .22819   .20279   .18091   .16173
  24    .67647   .50370   .40787   .34417   .29739   .26082   .23098   .20588   .18428   .16535
  25    .67606   .50413   .40889   .34567   .29929   .26307   .23354   .20873   .18738   .16869
  26    .67568   .50452   .40984   .34706   .30104   .26514   .23590   .21135   .19024   .17177
  27    .67533   .50489   .41071   .34833   .30266   .26705   .23809   .21378   .19290   .17463
  28    .67500   .50523   .41152   .34951   .30416   .26884   .24012   .21603   .19535   .17728
  29    .67470   .50555   .41228   .35062   .30555   .27049   .24200   .21812   .19764   .17975
  30    .67442   .50585   .41298   .35165   .30686   .27203   .24375   .22007   .19978   .18205
  31    .67416   .50612   .41363   .35260   .30807   .27347   .24539   .22190   .20177   .18420
  32    .67391   .50638   .41425   .35350   .30921   .27482   .24693   .22361   .20364   .18622
  33    .67368   .50662   .41482   .35434   .31027   .27608   .24837   .22521   .20539   .18811
  34    .67347   .50685   .41536   .35513   .31128   .27727   .24973   .22672   .20704   .18989
  35    .67327   .50707   .41587   .35588   .31222   .27839   .25101   .22814   .20859   .19157
  36    .67308   .50727   .41636   .35658   .31312   .27945   .25221   .22949   .21006   .19316
  37    .67290   .50746   .41681   .35724   .31396   .28045   .25335   .23076   .21146   .19466
  38    .67273   .50764   .41724   .35787   .31476   .28140   .25443   .23196   .21276   .19606
  39    .67257   .50781   .41765   .35847   .31551   .28229   .25545   .23309   .21401   .19741
  40    .67241   .50797   .41804   .35904   .31623   .28314   .25642   .23417   .21519   .19868
  42    .67213   .50828   .41876   .36009   .31766   .28472   .25822   .23617   .21738   .20105
  44    .67188   .50856   .41941   .36104   .31877   .28615   .25986   .23799   .21937   .20320
  46    .67164   .50880   .42000   .36191   .31987   .28745   .26135   .23966   .22118   .20516
  48    .67143   .50903   .42055   .36271   .32088   .28865   .26271   .24117   .22284   .20695
  50    .67123   .50925   .42105   .36343   .32180   .28975   .26397   .24256   .22437   .20860
  52    .67105   .50944   .42151   .36411   .32266   .29076   .26512   .24385   .22578   .21012
  54    .67089   .50962   .42193   .36473   .32346   .29170   .26619   .24504   .22708   .21153
  56    .67073   .50979   .42233   .36531   .32418   .29257   .26718   .24614   .22829   .21284
  58    .67059   .50995   .42270   .36585   .32487   .29338   .26811   .24717   .22941   .21405
  62    .67033   .51022   .42337   .36682   .32609   .29484   .26977   .24903   .23144   .21624
  66    .67010   .51048   .42395   .36767   .32718   .29612   .27123   .25065   .23322   .21817
  70    .66990   .51069   .42447   .36843   .32813   .29726   .27252   .25209   .23479   .21987
  74    .66972   .51089   .42492   .36910   .32898   .29826   .27368   .25337   .23619   .22138
  78    .66957   .51107   .42534   .36970   .32975   .29917   .27471   .25452   .23745   .22274
  82    .66942   .51122   .42571   .37024   .33043   .29998   .27564   .25556   .23859   .22397
  90    .66917   .51150   .42635   .37118   .33162   .30139   .27725   .25735   .24056   .22609
  98    .66897   .51172   .42689   .37197   .33262   .30257   .27860   .25885   .24219   .22786
 106    .66879   .51192   .42735   .37263   .33346   .30357   .27974   .26012   .24358   .22936
 114    .66864   .51208   .42774   .37321   .33418   .30443   .28071   .26121   .24477   .23065
 122    .66851   .51223   .42808   .37370   .33482   .30518   .28156   .26216   .24581   .23177
 138    .66829   .51247   .42864   .37452   .33585   .30641   .28297   .26372   .24752   .23362
 154    .66812   .51266   .42909   .37517   .33667   .30738   .28408   .26496   .24887   .23508
 170    .66798   .51281   .42944   .37570   .33734   .30817   .28498   .26596   .24997   .23627
 202    .66777   .51304   .43000   .37649   .33836   .30937   .28635   .26749   .25164   .23808
 234    .66762   .51322   .43040   .37708   .33909   .31026   .28735   .26860   .25285   .23939
 266    .66751   .51336   .43070   .37752   .33965   .31091   .28810   .26944   .25377   .24038
 330    .66734   .51353   .43112   .37814   .34044   .31185   .28917   .27063   .25508   .24179
 394    .66723   .51365   .43141   .37856   .34097   .31248   .28990   .27143   .25596   .24274
 522    .66709   .51381   .43178   .37910   .34164   .31327   .29081   .27244   .25707   .24394
 778    .66695   .51396   .43215   .37963   .34233   .31408   .29173   .27347   .25819   .24516
1290    .66684   .51409   .43245   .38007   .34288   .31474   .29248   .27430   .25910   .24613
2314    .66676   .51418   .43264   .38036   .34326   .31518   .29298   .27486   .25971   .24680
   ∞    .66667   .51429   .43290   .38073   .34372   .31573   .29361   .27556   .26048   .24763


In 1940, J. von Neumann and R. H. Kent in [2] investigated further the
estimation of probable error from mean square successive differences (sums of squares
of first differences). J. von Neumann, R. H. Kent, H. R. Bellinson, and B. I.
Hart [3] considered the distribution of

$$\delta^2 = \frac{\sum_{i=1}^{n-1}(x_{i+1} - x_i)^2}{n - 1}$$

in a paper which appeared in June 1941. J. D. Williams [4] obtained the
moments of $\eta = \delta^2/S^2$, where

$$S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2,$$

and indicated that the rth moment of $\eta$ is equal to the rth moment of $\delta^2$ divided
by the rth moment of $S^2$. The distribution of the ratio of the mean square
successive difference to the variance has been published by J. von Neumann
[5], [6] and B. I. Hart tabulated the probability integral and obtained percentage
points for this statistic ([7], [8]). Indeed, it should be remarked that the
statistical theory of successive differences is allied with the problem of serial
correlation [9]. Finally, the use of squared differences of higher order than the first for
estimating variance appears to have been suggested by A. A. Bennett. Quite
independently, a treatment of the subject was given by Morse [10] in connection
with problems on exterior ballistics. Various results on successive-difference
estimation including significance tests have been given by Tintner [13]. One of
Tintner’s tests involves the use of selected sets of differences.
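The first-order statistics mentioned in this historical survey are simple to compute side by side. The sketch below is illustrative only: the function names are invented, and the divisor n used for $S^2$ in the ratio is an assumption about the convention intended in the text.

```python
import math

def mean_square_successive_difference(xs):
    """Sum of squared first differences divided by (n - 1)."""
    n = len(xs)
    return sum((b - a) ** 2 for a, b in zip(xs, xs[1:])) / (n - 1)

def probable_error(xs):
    """0.6745 * sqrt(sum of squared first differences / (2 (n - 1)))."""
    return 0.6745 * math.sqrt(mean_square_successive_difference(xs) / 2.0)

def successive_difference_ratio(xs):
    """Mean square successive difference divided by S^2 (divisor n assumed)."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((v - mean) ** 2 for v in xs) / n
    return mean_square_successive_difference(xs) / s2

sample = [1.0, 2.0, 4.0, 7.0]   # made-up values with an obvious trend
msd = mean_square_successive_difference(sample)
pe = probable_error(sample)
ratio = successive_difference_ratio(sample)
```

For a trending sample like this one the ratio falls well below 2, the value it fluctuates around when the observations are independent with a common mean, which is why the ratio is linked in the literature to serial correlation.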
3. Definitions and notations. Suppose the observations Xi , Xi , Xs , • ■ ■ x„ 
are made at times a = U < U < k < — < tn = h and the i, are uniformly spaced 
without error Let /(<,) be the true trend so that tj, = /(i.) is the mean of the 
population from which is drawn and «, = x, — tj, is a random error. Further, 
let p be a non-negative integer less than n and denote to the fth backward differ- 
ence of order p of a: by , i e. 


A^x. = A^ ^x. - A” 'x._i = ^ (-1)' x._r , 

r-O V / 
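We define the following:

$$(1)\qquad \delta^2_{n,p} = \frac{1}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} (\Delta^p \epsilon_i)^2$$

$$(2)\qquad d^2_{n,p} = \frac{1}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} (\Delta^p x_i)^2$$

$$(3)\qquad v^2_{n,p} = \frac{1}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} (\Delta^p \eta_i)^2$$

$$(4)\qquad k_{n,p} = \frac{2}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} (\Delta^p \epsilon_i)(\Delta^p \eta_i)$$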
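The quantities (1) to (4) differ only in which series is differenced. As a minimal sketch of (2), assuming the observations are held in a plain Python list (the names `backward_differences` and `d2` are illustrative and do not come from the paper):

```python
from math import comb


def backward_differences(x, p):
    """Return the p-th backward differences of x, i.e. Delta^p x_i for i = p+1, ..., n."""
    d = list(x)
    for _ in range(p):
        d = [d[i] - d[i - 1] for i in range(1, len(d))]
    return d


def d2(x, p):
    """d^2_{n,p} of (2): sum of squared p-th differences over C(2p, p) * (n - p)."""
    n = len(x)
    if not 0 <= p < n:
        raise ValueError("p must be a non-negative integer less than n")
    diffs = backward_differences(x, p)
    return sum(v * v for v in diffs) / (comb(2 * p, p) * (n - p))


# Hypothetical error-free data following a quadratic trend, x_i = i^2.
quadratic = [i * i for i in range(10)]
print(d2(quadratic, 2))  # every second difference equals 2
print(d2(quadratic, 3))  # third differences of a quadratic vanish
```

With error-free data the statistic reduces to the trend term (3), so the second call illustrates how a quadratic trend disappears once third differences are taken.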


By E(u) we will mean the expected value of u, whereas the variance of u will
be denoted by

$$\mathrm{Var}(u) = E\{u - E(u)\}^2.$$

Basically, we shall assume that the $\epsilon_i$ are sufficiently Gaussian and independent that

5(6.) = E(e=) = 0, £?(€“) = 

M4 = ®(*t) = 3<7* , 

whenever i, j, a and § are positive integers for which 

I < i l<j<n. 

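4. Expected values. We will now determine the mean or expected values
of $\delta^2_{n,p}$ and $d^2_{n,p}$. Since each $\Delta^p \epsilon_i$ is a sum of independent errors with
coefficients $(-1)^r \binom{p}{r}$,

$$E(\delta^2_{n,p}) = \frac{1}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} \sum_{r=0}^{p} \binom{p}{r}^2 \sigma^2,$$

or

$$(5)\qquad E(\delta^2_{n,p}) = \sigma^2$$

(see Lemma 1.3 of section 6 below).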
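The divisor $\binom{2p}{p}$ in (1) is what makes the mean come out to $\sigma^2$: under the independence assumption the variance of a single pth difference is $\sigma^2$ times the sum of the squared binomial coefficients, and that sum equals $\binom{2p}{p}$. A short check of this identity (the function name is illustrative; the identity is assumed here to be the one the reference to section 6 relies on):

```python
from math import comb


def sum_squared_coefficients(p):
    """Sum of the squared binomial coefficients C(p, r)^2 over r = 0, ..., p."""
    return sum(comb(p, r) ** 2 for r in range(p + 1))


# Compare with the normalizing constant C(2p, p) used in (1) to (4).
for p in range(6):
    print(p, sum_squared_coefficients(p), comb(2 * p, p))
```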
Continuing, we have

$$E(d^2_{n,p}) = E\left\{ \frac{1}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} (\Delta^p \epsilon_i + \Delta^p \eta_i)^2 \right\} = \sigma^2 + \frac{1}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} (\Delta^p \eta_i)^2,$$

or

$$(6)\qquad E(d^2_{n,p}) = \sigma^2 + v^2_{n,p}.$$

Consequently, we observe, $d^2_{n,p}$ is on the average larger than $\sigma^2$ by the quantity
$v^2_{n,p}$. In a particular problem, therefore, we are faced with the situation of
choosing that combination of n and p which (i) regulates the size of $v^2_{n,p}$ and (ii)
gives the desired precision of our estimate of variance.

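5. The magnitude of $v^2_{n,p}$. In order to study the size of $v^2_{n,p}$, we will derive
for this quantity an upper bound which will indicate the applicability of the
method of differences to non-polynomial as well as polynomial trends.

Now,

$$\Delta^p \eta_i = \Delta^p f(t_i) = \int_{t_{i-1}}^{t_i} \int_0^h \cdots \int_0^h f^{(p)}(y_1 - y_2 - \cdots - y_p)\, dy_p\, dy_{p-1} \cdots dy_1,$$

where $t_i - t_{i-1} = h$, by straightforward integration. It will be convenient to
change the order of integration; thus

$$\Delta^p f(t_i) = \int_0^h \cdots \int_0^h \int_{t_{i-1}}^{t_i} f^{(p)}(y_1 - y_2 - \cdots - y_p)\, dy_1\, dy_p \cdots dy_2.$$

Since, from Schwarz's inequality it is clear that

$$\left\{ \int_\alpha^\beta g(s)\, ds \right\}^2 \le (\beta - \alpha) \int_\alpha^\beta \{g(s)\}^2\, ds$$

whenever $\alpha$ and $\beta$ are real numbers and g is integrable, we have

$$\{\Delta^p f(t_i)\}^2 \le h^p \int_0^h \cdots \int_0^h \int_{t_{i-1}}^{t_i} \{f^{(p)}(y_1 - y_2 - \cdots - y_p)\}^2\, dy_1\, dy_p \cdots dy_2.$$

Also,

$$\sum_{i=p+1}^{n} \int_0^h \cdots \int_0^h \int_{t_{i-1}}^{t_i} \{f^{(p)}(y_1 - y_2 - \cdots - y_p)\}^2\, dy_1\, dy_p \cdots dy_2 = \int_0^h \cdots \int_0^h \int_{t_p}^{b} \{f^{(p)}(y_1 - y_2 - \cdots - y_p)\}^2\, dy_1\, dy_p \cdots dy_2.$$

But for $0 \le r \le (p-1)h = t_p - a$ we have

$$\int_{t_p}^{b} \{f^{(p)}(y_1 - r)\}^2\, dy_1 = \int_{t_p - r}^{b - r} \{f^{(p)}(s)\}^2\, ds \le \int_a^b \{f^{(p)}(s)\}^2\, ds.$$

Consequently

$$\sum_{i=p+1}^{n} (\Delta^p \eta_i)^2 \le h^p \int_0^h \cdots \int_0^h \int_a^b \{f^{(p)}(s)\}^2\, ds\, dy_p \cdots dy_2 = h^{2p-1} \int_a^b \{f^{(p)}(s)\}^2\, ds.$$

Since $h = \dfrac{b-a}{n-1}$ we have finally

$$(7)\qquad v^2_{n,p} \le \frac{n-1}{\binom{2p}{p}(n-p)} \left( \frac{b-a}{n-1} \right)^{2p} \frac{\int_a^b \{f^{(p)}(s)\}^2\, ds}{b-a},$$

which is an upper bound for $v^2_{n,p}$ in terms of the average value of the square of
the pth derivative of the trend function.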
If the trend function f is of the polynomial form

$$f(t) = \sum_{r=0}^{p} a_r t^r,$$

then the effect of the trend can be eliminated from our observations by estimating
dispersion from (p + 1)st differences. However, if it is known that the trend is
of polynomial form, then an estimate of dispersion based on least squares would,
of course, be better. In fact, it will be shown later that the precision of $\delta^2_{n,p}$
decreases markedly as p increases. The use of $d^2_{n,p}$ as an estimate of $\sigma^2$ is primarily
of value when the type of trend is unknown; however, even when the type
of trend is known the computational simplicity of $d^2_{n,p}$ may offset to some extent
its lack of optimum precision.

Let us reflect on the magnitude of $v^2_{n,p}$ over a single period of a sinusoidal trend,
say $f(t) = \sin t$. In (7) we set $a = 0$, $b = 2\pi$ and secure

$$v^2_{n,p} \le \frac{1}{2} \cdot \frac{n-1}{\binom{2p}{p}(n-p)} \left( \frac{2\pi}{n-1} \right)^{2p},$$

since the average value of the square of any derivative of $\sin t$ over a full period is 1/2.
Taking n to be the number of observations for a complete period, a tabulation of
the upper bound for $v^2_{n,p}$ for this case is given in Table II. Thus, when there
are about seven or more observations in each interval of length one period, estimation
of dispersion from higher order differences may prove of considerable
value even for this rather extreme type of trend.
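The sinusoidal case can be explored numerically. The sketch below is illustrative only: it assumes n equally spaced times covering exactly one period $[0, 2\pi]$, computes the trend term $v^2_{n,p}$ directly from the differenced sine values, and compares it with the bound (7), taking the mean square of the pth derivative to be 1/2. The function names are not from the paper.

```python
from math import comb, pi, sin


def v2_sine(n, p):
    """v^2_{n,p} for eta_i = sin(t_i), with n equally spaced times covering [0, 2*pi]."""
    h = 2 * pi / (n - 1)
    d = [sin(i * h) for i in range(n)]
    for _ in range(p):
        d = [d[i] - d[i - 1] for i in range(1, len(d))]
    return sum(v * v for v in d) / (comb(2 * p, p) * (n - p))


def v2_sine_bound(n, p):
    """Right-hand side of (7) with a = 0, b = 2*pi and mean-square derivative 1/2."""
    return 0.5 * (n - 1) / (comb(2 * p, p) * (n - p)) * (2 * pi / (n - 1)) ** (2 * p)


for p in (1, 2, 3, 4):
    print(p, v2_sine(7, p), v2_sine_bound(7, p))
```

For seven observations per period both columns fall off quickly with p, which is the behaviour the paragraph above describes.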

6. Some combinatorial relations. Although we will ultimately establish
expressions for the variances of $\delta^2_{n,p}$ and $d^2_{n,p}$, it appears desirable to give first a
number of combinatorial relations which present themselves in the computation
of moments. The relations are easily checked and most of them are possibly
well known. Nevertheless, it will be convenient to record them for reference
and in some instances to give proofs. In what follows it will be understood that
$\binom{p}{q} = 0$ whenever p and q are not such integers that $0 \le q \le p$.

TABLE II
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Proof: (s - _ J^jfroin 1.6. 

Put s = 2p, t = p — r, then 

"’■(p? ■■) ’ ■ '”{(?- “ ip-r-l))- 

Lemma 1.8. If f is a function, i, n, p are integers and $p + 1 \le i \le n$, then




fir). 


Proof: 




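Lemma 1.9. If $-\infty < A(r, s) = A(s, r) < \infty$ for each integer r and s, then

$$E\left\{ \sum_{r=1}^{n} \sum_{s=1}^{n} A(r,s)\,\epsilon_r \epsilon_s \right\}^2 = (\mu_4 - 3\sigma^4) \sum_{r=1}^{n} A(r,r)^2 + \sigma^4 \left\{ \sum_{r=1}^{n} A(r,r) \right\}^2 + 2\sigma^4 \sum_{r=1}^{n} \sum_{s=1}^{n} A(r,s)^2.$$

Proof: Let N(r, s) = 1 when r < s and let N(r, s) = 0 otherwise. Clearly

$$\sum_{r=1}^{n} \sum_{s=1}^{n} A(r,s)\,\epsilon_r \epsilon_s = \sum_{r=1}^{n} A(r,r)\,\epsilon_r^2 + 2 \sum_{r=1}^{n} \sum_{s=1}^{n} N(r,s) A(r,s)\,\epsilon_r \epsilon_s,$$

and

$$E\left\{ \sum_{r=1}^{n} \sum_{s=1}^{n} A(r,s)\,\epsilon_r \epsilon_s \right\}^2 = E\left\{ \sum_{r=1}^{n} A(r,r)\,\epsilon_r^2 \right\}^2 + 4E\left\{ \sum_{r=1}^{n} \sum_{s=1}^{n} N(r,s) A(r,s)\,\epsilon_r \epsilon_s \right\}^2.$$

Now

$$E\left\{ \sum_{r=1}^{n} A(r,r)\,\epsilon_r^2 \right\}^2 = (\mu_4 - \sigma^4) \sum_{r=1}^{n} A(r,r)^2 + \sigma^4 \left\{ \sum_{r=1}^{n} A(r,r) \right\}^2,$$

and

$$4E\left\{ \sum_{r=1}^{n} \sum_{s=1}^{n} N(r,s) A(r,s)\,\epsilon_r \epsilon_s \right\}^2 = 4\sigma^4 \sum_{r=1}^{n} \sum_{s=1}^{n} N(r,s) A(r,s)^2 = 2\sigma^4 \sum_{r=1}^{n} \sum_{s=1}^{n} A(r,s)^2 - 2\sigma^4 \sum_{r=1}^{n} A(r,r)^2.$$

The last three relations combine to yield the desired result.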
Lemma 1.10 (n — p)^Ei5\,p) 

= It C^)T+'‘{§.tC 

Proof: Helped by 1.8, check that 


P 
— r 







Therefore 


Let 


= 0^)0- 


Er Ej 


and apply 1.9 to complete the proof. 
Lemma 1.11 


SS{..tC^)C- 

,’5’. C + .)’ - C%” 7 0’ ' 


Proof. 


§§ltC^)C- 

= ..t U C) C -i) C 4 - 7 
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= {? (0 C - .)}■ = ,l« C 4 - 1 


. "f 

(n 

i 

- V - 

■ 1 »’l) ( 

' 2p Y 
J} + V 




= (n - 

v) 

n— 3) 

S 1 

r— p— n 

< 2v \ 
\V + r) 

2 n~ 

\ -2l 

r— 

2p ^ 

^ Vp + n 

)■ 


= (n- 

v) 

fi— p 

X 1 

r*»p— n 

( 2p ^ 

\v + r) 

2 n— ' 

1 -2E 

T«( 

i'{C: 

.T-C 


= (n - 

v) 

n-p 

E 1 

jj— - ti 

( 2v \ 
\V + r/ 

|’-2p{ 


/ 2p 
\2p- 


= (?i - 

■ v) 

■f 

n 

( 2p ' 

\P + r 

)’-2p 

(“,-)■ 

+ 2p 



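Lemma 1.12.

$$\sum_{r=1}^{n} \sum_{i=p+1}^{n} \binom{p}{i-r}^2 = (n-p)\binom{2p}{p}.$$

Proof.

$$\sum_{r=1}^{n} \sum_{i=p+1}^{n} \binom{p}{i-r}^2 = \sum_{i=p+1}^{n} \sum_{r=1}^{n} \binom{p}{i-r}^2 = \sum_{i=p+1}^{n} \sum_{r=0}^{p} \binom{p}{r}^2, \text{ from 1.8};$$

$$= (n-p) \sum_{r=0}^{p} \binom{p}{r}^2 = (n-p)\binom{2p}{p}.$$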


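7. The variances of $\delta^2_{n,p}$ and $d^2_{n,p}$. In order to get some idea as to the efficiency
of the statistics $\delta^2_{n,p}$ and $d^2_{n,p}$, we will examine their variances. We have

$$(n-p)^2 \binom{2p}{p}^2 \mathrm{Var}(\delta^2_{n,p}) = (n-p)^2 \binom{2p}{p}^2 \{E(\delta^4_{n,p}) - E^2(\delta^2_{n,p})\}$$

$$= 2\sigma^4 \left\{ (n-p) \sum_{r=p-n}^{n-p} \binom{2p}{p+r}^2 - 2p\binom{2p-1}{p}^2 + 2p\binom{2p-1}{n}^2 \right\}$$

with the aid of Lemmas 1.10, 1.11, 1.12 and using the relation $\mu_4 - 3\sigma^4 = 0$.
Thus,

$$(8)\qquad \mathrm{Var}(\delta^2_{n,p}) = \frac{2\sigma^4}{(n-p)^2 \binom{2p}{p}^2} \left\{ (n-p) \sum_{r=p-n}^{n-p} \binom{2p}{p+r}^2 - 2p\binom{2p-1}{p}^2 + 2p\binom{2p-1}{n}^2 \right\}.$$

If 2p < n, then

$$\binom{2p-1}{n} = 0.$$

Moreover

$$\sum_{r=p-n}^{n-p} \binom{2p}{p+r}^2 = \sum_{r=-p}^{p} \binom{2p}{p+r}^2 = \binom{4p}{2p}.$$

Therefore,

$$(9)\qquad \mathrm{Var}(\delta^2_{n,p}) = \frac{2\sigma^4}{(n-p)^2 \binom{2p}{p}^2} \left\{ (n-p)\binom{4p}{2p} - 2p\binom{2p-1}{p}^2 \right\}$$

when 2p < n.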
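The variance of $\delta^2_{n,p}$ can be checked exactly for small n and p. The sketch below is an illustration under the Gaussian assumption of section 3, not the paper's derivation: it uses the standard facts that two pth differences r steps apart have covariance $(-1)^r \binom{2p}{p+r}\sigma^2$ and that the variance of a sum of squares of jointly Gaussian variables is twice the sum of their squared covariances. It compares this direct sum with the bracketed expression $(n-p)\sum_r \binom{2p}{p+r}^2 - 2p\binom{2p-1}{p}^2 + 2p\binom{2p-1}{n}^2$ and with its reduced form for 2p < n. Function names are illustrative.

```python
from fractions import Fraction
from math import comb


def var_delta2_direct(n, p):
    """Var(delta^2_{n,p}) / sigma^4 as twice the sum of squared covariances of the differences."""
    m = n - p
    reach = min(m - 1, p)  # largest lag with a non-zero covariance inside the sample
    total = sum((m - abs(r)) * comb(2 * p, p + r) ** 2 for r in range(-reach, reach + 1))
    return Fraction(2 * total, (m * comb(2 * p, p)) ** 2)


def var_delta2_general(n, p):
    """Var(delta^2_{n,p}) / sigma^4 from the general bracketed expression, for p >= 1."""
    m = n - p
    reach = min(m, p)
    s = sum(comb(2 * p, p + r) ** 2 for r in range(-reach, reach + 1))
    brace = m * s - 2 * p * comb(2 * p - 1, p) ** 2 + 2 * p * comb(2 * p - 1, n) ** 2
    return Fraction(2 * brace, (m * comb(2 * p, p)) ** 2)


def var_delta2_reduced(n, p):
    """Var(delta^2_{n,p}) / sigma^4 from the reduced form, valid when 2p < n."""
    m = n - p
    brace = m * comb(4 * p, 2 * p) - 2 * p * comb(2 * p - 1, p) ** 2
    return Fraction(2 * brace, (m * comb(2 * p, p)) ** 2)


print(var_delta2_direct(12, 7), var_delta2_general(12, 7))
print(var_delta2_direct(20, 4), var_delta2_reduced(20, 4))
```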

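As for the variance of $d^2_{n,p}$, we have

$$\mathrm{Var}(d^2_{n,p}) = E\{d^2_{n,p} - v^2_{n,p} - \sigma^2\}^2 = E\{\delta^2_{n,p} + k_{n,p} + v^2_{n,p} - v^2_{n,p} - \sigma^2\}^2 = E\{(\delta^2_{n,p} - \sigma^2) + k_{n,p}\}^2,$$

or

$$(10)\qquad \mathrm{Var}(d^2_{n,p}) = \mathrm{Var}(\delta^2_{n,p}) + E(k^2_{n,p}),$$

since $E[(\delta^2_{n,p} - \sigma^2)k_{n,p}] = 0$.

However, from Schwarz's inequality, it is guaranteed that

$$E(k^2_{n,p}) \le 4\sigma^2 v^2_{n,p}.$$

Thus

$$(11)\qquad \mathrm{Var}(d^2_{n,p}) \le \mathrm{Var}(\delta^2_{n,p}) + 4\sigma^2 v^2_{n,p}.$$

An upper bound has already been given for $v^2_{n,p}$ in section 5 above.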

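8. The efficiency of $\delta^2_{n,p}$. It is appropriate to consider the efficiency (as
defined by Fisher [11]) of the statistic $\delta^2_{n,p}$. In this sense, the efficiency of
$\delta^2_{n,p}$ is given by

$$W(n, p) = \frac{\mathrm{Var}\, s^2_n}{\mathrm{Var}\, \delta^2_{n,p}},$$

where

$$s^2_n = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}.$$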
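Accordingly,

$$W(n, p) = \frac{2\sigma^4}{(n-1)\,\mathrm{Var}(\delta^2_{n,p})},$$

or

$$(12)\qquad W(n, p) = \frac{(n-p)^2 \binom{2p}{p}^2}{(n-1) \left\{ (n-p) \sum_{r=p-n}^{n-p} \binom{2p}{p+r}^2 - 2p\binom{2p-1}{p}^2 + 2p\binom{2p-1}{n}^2 \right\}}.$$

If 2p < n,

$$(13)\qquad W(n, p) = \frac{(n-p)^2 \binom{2p}{p}^2}{(n-1) \left\{ (n-p)\binom{4p}{2p} - 2p\binom{2p-1}{p}^2 \right\}}$$

from (9); or

$$(14)\qquad W(n, p) = \frac{2(n-p)^2 \binom{2p}{p}^2}{(n-1) \left\{ 2(n-p)\binom{4p}{2p} - p\binom{2p}{p}^2 \right\}}$$

if 2p < n.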


Formulas (12) and (13) were used in preparing Table I given at the end of the
paper. For convenience in using formulas (1) and (2) the binomial coefficients
$\binom{2p}{p}$ for $0 \le p \le 10$ are given in Table III.


If n > 2, then

$$(15)\qquad W(n, 1) = \frac{2(n-1)}{3n-4},$$

as was pointed out by von Neumann, Kent, Bellinson, and Hart in [3].
If n > 4, then

$$(16)\qquad W(n, 2) = \frac{18(n-2)^2}{(n-1)(35n-88)}.$$
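The closed forms for p = 1 and p = 2 can be confirmed with exact rational arithmetic. The sketch below is illustrative: it computes W(n, p) from the covariances of the pth differences under the Gaussian assumption (the lag-r covariance being proportional to $\binom{2p}{p+r}$) rather than from the printed formulas, and the name `efficiency` is not from the paper.

```python
from fractions import Fraction
from math import comb


def efficiency(n, p):
    """W(n, p) = Var(s_n^2) / Var(delta^2_{n,p}) as an exact fraction, for n >= 2 and 0 <= p < n."""
    m = n - p
    reach = min(m - 1, p)
    total = sum((m - abs(r)) * comb(2 * p, p + r) ** 2 for r in range(-reach, reach + 1))
    return Fraction((m * comb(2 * p, p)) ** 2, (n - 1) * total)


for n in (5, 10, 20, 40):
    print(n, efficiency(n, 1), efficiency(n, 2))
```

Exact fractions are used so that agreement with $2(n-1)/(3n-4)$ and $18(n-2)^2/\{(n-1)(35n-88)\}$ is an identity rather than a floating-point coincidence.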
As a limiting value for n, we have

$$(17)\qquad W(\infty, p) = \lim_{n \to \infty} W(n, p) = \frac{\binom{2p}{p}^2}{\binom{4p}{2p}}.$$

Using Stirling's formula for the approximation to the factorial, we have

$$\lim_{p \to \infty} \sqrt{p}\, W(\infty, p) = \sqrt{\frac{2}{\pi}}.$$

Thus, as $p \to \infty$, $W(\infty, p)$ tends to zero and is asymptotically equal to $\sqrt{2/(\pi p)}$.
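The rate at which the limiting efficiency decays can be seen numerically. A small sketch, assuming the limiting value is the ratio $\binom{2p}{p}^2 / \binom{4p}{2p}$ and comparing it with the Stirling approximation $\sqrt{2/(\pi p)}$ (the function name is illustrative):

```python
from math import comb, pi, sqrt


def limiting_efficiency(p):
    """W(infinity, p) taken as C(2p, p)^2 / C(4p, 2p)."""
    return comb(2 * p, p) ** 2 / comb(4 * p, 2 * p)


for p in (1, 2, 5, 10, 50, 200):
    print(p, limiting_efficiency(p), sqrt(2 / (pi * p)))
```

For p = 1 and p = 2 the values 2/3 and 18/35 agree with the limits of the closed forms for W(n, 1) and W(n, 2) as n grows.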



TABLE III

Values of $\binom{2p}{p}$

    p    $\binom{2p}{p}$
    0          1
    1          2
    2          6
    3         20
    4         70
    5        252
    6        924
    7       3432
    8      12870
    9      48620
   10     184756


For the case $n \ge 2$, $p \ge 1$ and f constant, then

$$s^2_n = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},$$

$\delta^2_{n,p}$ and $d^2_{n,p}$ are all unbiased estimates of the population variance $\sigma^2$. Moreover,
for this case

$$W(n, p) = \frac{\mathrm{Var}(s^2_n)}{\mathrm{Var}(\delta^2_{n,p})} = \frac{\mathrm{Var}(s^2_n)}{\mathrm{Var}(d^2_{n,p})}.$$

Using $s^2_m$ based on m − 1 degrees of freedom and keeping the trend, f, constant,
then m and n may be chosen so that approximately

$$\mathrm{Var}(s^2_m) = \mathrm{Var}(d^2_{n,p})$$

and for a normal population this means that

$$m = 1 + (n - 1)W(n, p).$$
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Using Table I, it may be seen that for constant trend, /, the worth of djo.io as 
an estimate of <r for a normal population is about the same as that of su , whereas 
that of dlo, 1 is about equivalent to S40 However, if the trend / is not constant 
then the worth of si as an estimate of a is diminished while that of di.,, is 
increased. 

Similarly, if the trend is cubic over 20 observations then least squares gives
an unbiased estimate of $\sigma^2$ based on 16 degrees of freedom, whereas $d^2_{20,4}$ gives an
estimate equivalent in precision to about 6.4 degrees of freedom. However, if
only eight observations follow a cubic trend, then least squares furnishes an
unbiased estimate of $\sigma^2$ based on four degrees of freedom whereas $d^2_{8,4}$ furnishes an
estimate equivalent to about 1.9 degrees of freedom. Thus, in the case of 20
observations, cubic least squares is, so to speak, 2.5 times as valuable as $d^2_{20,4}$;
in the case of eight observations, cubic least squares is 2.1 times as valuable
as $d^2_{8,4}$.
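The equivalent degrees of freedom quoted in this comparison follow from the relation m = 1 + (n − 1)W(n, p), so that an estimate from pth differences is worth about (n − 1)W(n, p) degrees of freedom. A sketch under the Gaussian assumption, computing this product from the covariances of the differences (the function name is illustrative):

```python
from math import comb


def equivalent_degrees_of_freedom(n, p):
    """(n - 1) * W(n, p): degrees of freedom of the s^2 whose variance matches that of delta^2_{n,p}."""
    m = n - p
    reach = min(m - 1, p)
    total = sum((m - abs(r)) * comb(2 * p, p + r) ** 2 for r in range(-reach, reach + 1))
    return (m * comb(2 * p, p)) ** 2 / total


# (n, p, degrees of freedom of the cubic least squares fit)
for n, p, dof_least_squares in ((20, 4, 16), (8, 4, 4)):
    dof = equivalent_degrees_of_freedom(n, p)
    print(n, p, round(dof, 1), round(dof_least_squares / dof, 1))
```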

It might be mentioned that the method of differences is of value in estimating
goodness of fit. If the fit is good, then our estimate of $\sigma^2$ derived from least
squares should on the average be equal to the estimate derived from a suitable
$d^2_{n,p}$. If the fit is poor then $d^2_{n,p}$ will be smaller on the average than the former.

9. The approximate probable error in estimating $\sigma$ from differences. The
approximate standard error of $\delta_{n,p}$ is given by the relation

$$\mathrm{S.E.}(\delta_{n,p}) = \frac{1}{2} \cdot \frac{\mathrm{S.E.}(\delta^2_{n,p})}{\sigma} = \frac{\sigma}{\sqrt{2(n-1)W(n, p)}}.$$

If p has been so chosen that $v^2_{n,p}$ is suitably small then [see equation (11)]
some confidence may be put in the approximate formulas:

$$(18)\qquad \mathrm{S.E.}(d_{n,p}) = \frac{\sigma}{\sqrt{2(n-1)W(n, p)}}$$

$$(19)\qquad \mathrm{P.E.}(d_{n,p}) = \frac{.6745\,\sigma}{\sqrt{2(n-1)W(n, p)}}.$$

Formula (19) was used in preparing Table IV which gives the approximate
probable error to be feared in using $d_{n,p}$ as an estimate of $\sigma$. This table should
yield interesting information whenever p has been chosen so that $d^2_{n,p}$ is a suitably
unbiased estimate of $\sigma^2$.
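Entries of the kind tabulated in Table IV can be regenerated from (19). The sketch below is illustrative: it evaluates $2(n-1)W(n, p)$ from the covariances of the pth differences under the Gaussian assumption and returns the factor that multiplies $\sigma$. The function name is not from the paper, and the constant .6745 is taken from (19) as printed.

```python
from math import comb, sqrt


def probable_error_factor(n, p):
    """Approximate P.E.(d_{n,p}) / sigma = 0.6745 / sqrt(2 (n - 1) W(n, p))."""
    m = n - p
    reach = min(m - 1, p)
    total = sum((m - abs(r)) * comb(2 * p, p + r) ** 2 for r in range(-reach, reach + 1))
    two_n_minus_1_w = 2 * (m * comb(2 * p, p)) ** 2 / total
    return 0.6745 / sqrt(two_n_minus_1_w)


for n, p in ((2, 1), (3, 1), (20, 4), (40, 10)):
    print(n, p, round(probable_error_factor(n, p), 4))
```

The product $2(n-1)W(n, p)$ is computed directly, so the function also covers the column p = 0.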


10. Remarks. We have presented a useful technique for estimating variance 
from higher order differences and have given the precision of our estimate. The 
method of estimating variance from higher order differences appears to be quite 
valuable in cases where the type of trend in our observations is unknown. A 
considerable field of work remains concerning a complete investigation of the 
distribution and other properties of the statistic $d^2_{n,p}$. In this connection,

Baer [12] has already published a study on the stochastic limit of r . 

fl “* -L 

It is hoped that others will contribute to the problem of estimating dispersion 
TABLE IV

The Probable Error in Estimating $\sigma$ From Differences*

    n    p=0      1      2      3      4      5      6      7      8      9     10
    1  .4769
    2  .3373  .4769
    3  .2753  .3771  .4769
    4  .2384  .3180  .4054  .4769
    5  .2133  .2796  .3495  .4215  .4769
    6  .1948  .2524  .3104  .3704  .4404  .4769
    7  .1803  .2317  .2817  .3318  .3855  .4390  .4769
    8  .1686  .2154  .2596  .3024  .3477  .3969  .4442  .4769
    9  .1589  .2022  .2420  .2794  .3183  .3604  .4057  .4481  .4769
   10  .1508  .1911  .2274  .2610  .2948  .3311  .3708  .4128  .4513  .4769
   11  .1438  .1816  .2153  .2457  .2758  .3074  .3417  .3794  .4186  .4537  .4769
   12  .1376  .1734  .2048  .2328  .2599  .2880  .3180  .3508  .3867  .4234  .4558
   13  .1323  .1663  .1958  .2217  .2465  .2717  .2983  .3272  .3587  .3930  .4276
   14  .1274  .1599  .1878  .2120  .2350  .2579  .2818  .3073  .3351  .3656  .3984
   15  .1231  .1542  .1808  .2035  .2248  .2459  .2677  .2905  .3152  .3423  .3718
   16  .1192  .1491  .1744  .1960  .2159  .2355  .2554  .2761  .2983  .3223  .3485
   17  .1156  .1445  .1687  .1892  .2080  .2262  .2447  .2637  .2837  .3052  .3286
   18  .1124  .1403  .1636  .1831  .2009  .2180  .2352  .2527  .2710  .2906  .3116
   19  .1094  .1364  .1589  .1775  .1945  .2106  .2267  .2430  .2599  .2777  .2967
   20  .1066  .1328  .1545  .1724  .1886  .2040  .2191  .2343  .2500  .2663  .2837
   21  .1040  .1295  .1506  .1677  .1832  .1978  .2121  .2264  .2411  .2562  .2722
   22  .1016  .1265  .1468  .1634  .1783  .1922  .2058  .2193  .2331  .2472  .2620
   23  .0994  .1236  .1433  .1594  .1738  .1871  .2000  .2129  .2258  .2391  .2529
   24  .0973  .1209  .1401  .1557  .1695  .1824  .1948  .2069  .2191  .2316  .2446
   25  .0954  .1184  .1371  .1522  .1656  .1779  .1898  .2016  .2131  .2249  .2370
   26  .0936  .1160  .1343  .1490  .1619  .1739  .1853  .1964  .2075  .2187  .2301
   27  .0918  .1138  .1316  .1460  .1585  .1700  .1810  .1917  .2023  .2130  .2238
   28  .0902  .1117  .1291  .1431  .1553  .1664  .1770  .1873  .1975  .2077  .2180
   29  .0885  .1097  .1268  .1404  .1522  .1631  .1733  .1832  .1930  .2028  .2126
   30  .0871  .1078  .1245  .1378  .1493  .1599  .1698  .1794  .1888  .1981  .2076
   31  .0857  .1060  .1224  .1354  .1466  .1569  .1665  .1758  .1848  .1938  .2029
   32  .0843  .1043  .1204  .1331  .1441  .1540  .1634  .1724  .1811  .1898  .1985
   33  .0831  .1027  .1184  .1309  .1416  .1514  .1605  .1692  .1776  .1860  .1944
   34  .0818  .1012  .1166  .1288  .1393  .1488  .1577  .1661  .1744  .1825  .1905
   35  .0807  .0997  .1149  .1268  .1371  .1464  .1550  .1632  .1713  .1791  .1869
   36  .0795  .0983  .1132  .1249  .1350  .1441  .1525  .1605  .1683  .1759  .1834
   37  .0784  .0969  .1116  .1231  .1330  .1418  .1501  .1579  .1655  .1729  .1802
   38  .0774  .0956  .1101  .1214  .1311  .1397  .1478  .1555  .1628  .1700  .1771
   39  .0764  .0943  .1086  .1197  .1292  .1377  .1456  .1531  .1603  .1673  .1741
   40  .0754  .0931  .1072  .1181  .1274  .1358  .1435  .1508  .1578  .1646  .1713




TABLE IV (Continued)

    n    p=0      1      2      3      4      5      6      7      8      9     10
   42  .0736  .0909  .1045  .1151  .1241  .1322  .1396  .1466  .1533  .1597  .1661
   44  .0719  .0887  .1020  .1123  .1211  .1288  .1360  .1427  .1491  .1553  .1613
   46  .0703  .0868  .0997  .1097  .1182  .1257  .1326  .1391  .1453  .1512  .1570
   48  .0689  .0849  .0975  .1073  .1155  .1228  .1295  .1357  .1417  .1474  .1529
   50  .0675  .0832  .0955  .1050  .1130  .1201  .1266  .1326  .1383  .1438  .1492
   52  .0661  .0816  .0936  .1029  .1107  .1176  .1238  .1297  .1352  .1405  .1457
   54  .0649  .0800  .0918  .1009  .1085  .1152  .1213  .1270  .1323  .1375  .1425
   56  .0637  .0785  .0901  .0990  .1064  .1129  .1189  .1244  .1296  .1346  .1394
   58  .0626  .0771  .0885  .0972  .1045  .1108  .1166  .1220  .1271  .1319  .1366
   62  .0606  .0746  .0855  .0939  .1008  .1069  .1125  .1176  .1224  .1270  .1313
   66  .0587  .0723  .0828  .0909  .0975  .1034  .1087  .1136  .1182  .1225  .1266
   70  .0570  .0702  .0804  .0881  .0946  .1002  .1053  .1100  .1144  .1185  .1224
   74  .0554  .0682  .0781  .0856  .0919  .0973  .1022  .1067  .1109  .1149  .1186
   78  .0540  .0664  .0760  .0833  .0894  .0947  .0994  .1037  .1077  .1115  .1152
   82  .0527  .0648  .0741  .0812  .0871  .0922  .0968  .1009  .1048  .1085  .1120
   90  .0503  .0618  .0707  .0774  .0830  .0878  .0921  .0960  .0997  .1031  .1063
   98  .0482  .0592  .0677  .0741  .0794  .0840  .0880  .0917  .0952  .0984  .1014
  106  .0463  .0569  .0650  .0712  .0762  .0805  .0845  .0880  .0913  .0943  .0972
  114  .0447  .0549  .0627  .0686  .0734  .0776  .0813  .0847  .0878  .0907  .0934
  122  .0432  .0530  .0606  .0663  .0709  .0749  .0785  .0817  .0847  .0876  .0900
  138  .0406  .0498  .0569  .0622  .0666  .0703  .0736  .0766  .0794  .0819  .0843
  154  .0384  .0472  .0538  .0589  .0630  .0664  .0695  .0723  .0749  .0773  .0796
  170  .0366  .0449  .0512  .0560  .0599  .0632  .0661  .0687  .0711  .0734  .0755
  202  .0336  .0412  .0470  .0513  .0548  .0578  .0606  .0629  .0650  .0671  .0689
  234  .0312  .0382  .0436  .0476  .0509  .0537  .0561  .0583  .0603  .0621  .0639
  266  .0292  .0359  .0409  .0446  .0477  .0503  .0526  .0546  .0565  .0582  .0598
  330  .0262  .0322  .0367  .0400  .0428  .0451  .0471  .0489  .0505  .0521  .0535
  394  .0240  .0295  .0336  .0366  .0391  .0412  .0430  .0447  .0462  .0476  .0488
  522  .0209  .0256  .0292  .0318  .0339  .0357  .0373  .0387  .0400  .0412  .0423
  778  .0171  .0210  .0239  .0260  .0278  .0292  .0306  .0317  .0327  .0337  .0346
 1290  .0133  .0163  .0185  .0202  .0216  .0227  .0237  .0246  .0254  .0261  .0268
 2314  .0099  .0121  .0138  .0151  .0161  .0169  .0177  .0183  .0189  .0196  .0200

* If $d^2_{n,p}$ is a sufficiently unbiased estimate of $\sigma^2$ then the approximate probable error
to be feared in using $d_{n,p}$ as an estimate of $\sigma$ may be obtained by multiplying the
tabular entries by $\sigma$.
when observed data display trends as it is believed that the method of differences
deserves much attention. In particular, it is hoped that someone will have the
time and ingenuity to calculate the distribution of the statistic 


Were this done, an admirable criterion would be at hand for gauging the significance
of a change in the estimate of $\sigma^2$ as we pass from differences of order p to
those of order p + 1. Of course, useful information in this connection could be
obtained from a knowledge of the distributions of the two estimates; in fact their
variances as herein calculated give us a basis for somewhat reasonable conclusions.
An expression for the standard error of the difference between the
estimates of $\sigma^2$ from two consecutive series of finite differences is given in
[13, Chapter VI].

In connection with testing goodness of fit, it would be valuable also to know 
the distribution of 

where $S^2_{n,p}$ is the estimate of variance derived from the least squares fitting of a
polynomial of degree p.

For convenience of reference, we conclude the paper with

11. A concise description of the method and its precision. It frequently
happens that successive observations made at regular intervals are subject to
the same standard error $\sigma$ while the means of the populations from which they
are drawn display a trend. We give here a method of estimating the variance $\sigma^2$
and of determining the precision of our estimate. This method is primarily of
value when the trend is unknown; however, even when the type of trend is known,
its computational simplicity may make the method advantageous.


The method. Arrange the data in a vertical column and then in the usual
way form difference columns of order 1, 2, ..., p. Sum the squares of the pth
order differences and divide by the number $(n - p)\binom{2p}{p}$. Our estimate of $\sigma^2$
is the number $d^2_{n,p}$, where

$$d^2_{n,p} = \frac{1}{\binom{2p}{p}(n-p)} \sum_{i=p+1}^{n} (\Delta^p x_i)^2.$$


¹ Dixon [9] gives moments of the statistic

$$\frac{\sum_{i=1}^{n} (x_i - 2x_{i+1} + x_{i+2})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2},$$

where $x_{n+1} = x_1$ and $x_{n+2} = x_2$.
The precision. The precision of this estimate may be determined from the
following information (which has been derived in the present paper).

$$E(d^2_{n,p}) = \sigma^2 + v^2_{n,p};$$

$$v^2_{n,p} \le \frac{n-1}{\binom{2p}{p}(n-p)} \left( \frac{b-a}{n-1} \right)^{2p} \frac{\int_a^b \{f^{(p)}(s)\}^2\, ds}{b-a};$$

$$\mathrm{Var}(d^2_{n,p}) \le \mathrm{Var}(\delta^2_{n,p}) + 4\sigma^2 v^2_{n,p};$$

$$\mathrm{Var}(\delta^2_{n,p}) = \frac{2\sigma^4}{(n-1)W(n, p)},$$

where W(n, p) is given in Table I.


TABLE V

    p    $\sigma_x$    $\sigma_y$    $\sigma_z$
    1    18.90    184.62    11.22
    2     1.21      1.88    10.56
    3      .88      1.85    10.30
    4      .87      1.84    10.12
    5      .86      1.83    10.01


In case $v^2_{n,p}$ is sufficiently small (this is determined by the requirements of the
given problem), then Table IV may be used directly to determine the approximate
probable error in using $d_{n,p}$ as an estimate of $\sigma$.

An example. As a practical example of the use of the method of differences
when the trend is unknown and of the stability of the statistic $d_{n,p}$ with respect
to p, we mention a recent problem at Aberdeen Proving Ground which had to do
with estimating the accuracy with which certain photographic measurements
locate a moving object. Ballistic Cameras were used to determine horizontal,
x and y, and vertical, z, coordinates (all in feet) of an airplane traveling about
160 mph at an elevation of about 35,000 feet. An automatic pilot was in use in
the airplane as it flew over a three mile course. At one second intervals for a
period of 70 seconds two Ballistic Cameras, 6000 feet apart, were used to locate
the plane. Since the plane was traveling pretty much in the y direction one
would expect that first differences would yield a standard error in y far in excess
of its true one; that second differences would furnish a much better estimate;
and that perhaps third differences would yield a still more trustworthy one. No
matter what order of difference is used we never expect such an estimate to be
too small. In this problem, the standard errors in x, y, z as estimated from differences
of certain orders, p, were as given in Table V.
THE EFFICIENCY OF SEQUENTIAL ESTIMATES AND WALD'S
EQUATION FOR SEQUENTIAL PROCESSES


By J. Wolfowitz 
Columbia University 


1. Summary. Let n successive independent observations be made on the
same chance variable whose distribution function $f(x, \theta)$ depends on a single
parameter $\theta$. The number n is a chance variable which depends upon the outcomes
of successive observations; it is precisely defined in the text below. Let
$\theta^*(x_1, \cdots, x_n)$ be an estimate of $\theta$ whose bias is $b(\theta)$. Subject to certain regularity
conditions stated below, it is proved that

$$\sigma^2(\theta^*) \ge \left( 1 + \frac{db}{d\theta} \right)^2 \left[ En\; E\left( \frac{\partial \log f}{\partial \theta} \right)^2 \right]^{-1}.$$

When $f(x, \theta)$ is the binomial distribution and $\theta^*$ is unbiased the lower bound
given here specializes to one first announced by Girshick [3], obtained under no
doubt different conditions of regularity. When the chance variable n is a constant
the lower bound given above is the same as that obtained in [2], page 480,
under different conditions of regularity.¹

Let the parameter $\theta$ consist of l components $\theta_1, \cdots, \theta_l$ for which there are
given the respective unbiased estimates $\theta_1^*(x_1, \cdots, x_n), \cdots, \theta_l^*(x_1, \cdots, x_n)$.
Let $\|\lambda_{ij}\|$ be the non-singular covariance matrix of the latter, and $\|\lambda^{ij}\|$ its
inverse. The concentration ellipsoid in the space of $(k_1, \cdots, k_l)$ is defined as

$$\sum_{i,j} \lambda^{ij} (k_i - \theta_i)(k_j - \theta_j) = l + 2.$$

(This valuable concept is due to Cramér.) If a unit mass be uniformly distributed
over the concentration ellipsoid, the matrix of its products of inertia
will coincide with the covariance matrix $\|\lambda_{ij}\|$. In [4] Cramér proves that no
matter what the unbiased estimates $\theta_1^*, \cdots, \theta_l^*$ (provided that certain regularity
conditions are fulfilled), when n is constant their concentration ellipsoid
always contains within itself the ellipsoid

$$\sum_{i,j} \mu_{ij} (k_i - \theta_i)(k_j - \theta_j) = l + 2$$

where

$$\mu_{ij} = nE\left( \frac{\partial \log f}{\partial \theta_i} \frac{\partial \log f}{\partial \theta_j} \right).$$

¹ To whom this result is to be ascribed is not clear from the context in which Professor
Cramér describes it (in [2]). After the present paper was completed the author learned of
the papers by Rao [8] and Aitken and Silverstone [9], both of which deal with this question.
The author is indebted to Prof. M. S. Bartlett for drawing his attention to these papers.
Consider now the sequential procedure of this paper. Let $\theta_1^*, \cdots, \theta_l^*$ be,
as before, unbiased estimates of $\theta_1, \cdots, \theta_l$, respectively, recalling, however,
that the number n of observations is a chance variable. It is proved that the
concentration ellipsoid of $\theta_1^*, \cdots, \theta_l^*$ always contains within itself the ellipsoid

$$\sum_{i,j} \mu'_{ij} (k_i - \theta_i)(k_j - \theta_j) = l + 2$$

where

$$\mu'_{ij} = En\; E\left( \frac{\partial \log f}{\partial \theta_i} \frac{\partial \log f}{\partial \theta_j} \right).$$

When n is a constant this becomes Cramér's result (under different conditions
of regularity).

In section 7 is presented a number of results related to the equation
$EZ_n = En\,EX$, which is due to Wald [6] and is fundamental for sequential
analysis.

2. Introduction. Let X be a chance variable whose distribution function
$f(x, \theta)$ depends on the parameter $\theta$. It is assumed that X either has a probability
density function (which we then denote by $f(x, \theta)$) or that it can take only
an at most denumerable number of discrete values (in the latter case
$f(x, \theta) = P\{X = x\}$, where the latter symbol denotes the probability of the
relation in braces). Let $\omega = x_1, x_2, \cdots$ be an infinite sequence of observations
on X, and let $\Omega$ be the space of "points" $\omega$. Let there be given an infinite
sequence of Borel measurable functions $\varphi_1(x_1), \varphi_2(x_1, x_2), \cdots, \varphi_j(x_1, \cdots, x_j), \cdots$
defined for all $\omega$ in $\Omega$, such that each takes only the values zero and one. It is
well known that the function $f(x, \theta)$ defines a measure (probability) on a Borel
field in $\Omega$. We assume that everywhere in $\Omega$, except possibly on a set whose probability
is zero for all $\theta$ under consideration, at least one of the functions $\varphi_1, \varphi_2, \cdots$
takes the value one. Let $n(\omega)$ be the smallest integer at which this occurs.
Thus $n(\omega)$ is a chance variable.
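The construction of $n(\omega)$ from the indicator functions $\varphi_j$ can be made concrete with a small example. The sketch below is purely illustrative: it assumes Bernoulli observations, a hypothetical rule that stops at the second success or after N = 6 observations, and it enumerates all sequences exactly. It also evaluates both sides of the equation $EZ_n = En\,EX$ mentioned in the summary, with $Z_n = x_1 + \cdots + x_n$.

```python
from fractions import Fraction
from itertools import product

N = 6  # hypothetical cap that forces termination


def phi(prefix):
    """phi_j(x_1, ..., x_j): 1 to stop, 0 to continue (stop at the second success or at j = N)."""
    return 1 if sum(prefix) >= 2 or len(prefix) == N else 0


def stopping_time(omega):
    """n(omega): the smallest j at which phi_j takes the value one."""
    for j in range(1, len(omega) + 1):
        if phi(omega[:j]) == 1:
            return j
    raise ValueError("the rule did not terminate on this sequence")


def expectations(theta):
    """Exact E(n) and E(Z_n) for independent Bernoulli(theta) observations."""
    e_n = Fraction(0)
    e_z = Fraction(0)
    for omega in product((0, 1), repeat=N):
        prob = Fraction(1)
        for x in omega:
            prob *= theta if x == 1 else 1 - theta
        n = stopping_time(omega)
        e_n += prob * n
        e_z += prob * sum(omega[:n])
    return e_n, e_z


e_n, e_z = expectations(Fraction(1, 3))
print(e_n, e_z, e_n * Fraction(1, 3))
```

Because the decision to stop at step j uses only the first j observations, summing over full sequences of length N gives the same expectations as summing over the stopped sequences.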

In statistical applications the chance variable $n(\omega)$ may be interpreted as a
rule for terminating a sequence of observations on the chance variable X, the
probability of termination being one, and the decision to terminate depending
only upon the observations obtained. A sequential test is an example of this
procedure. The converse is, however, not true, because the process described
above does not require that any statistical decision should be reached when the
process of drawing observations is terminated.

An "estimate" of $\theta$ is a function $\theta^*(x_1, \cdots, x_n)$ of the observations $x_1, \cdots, x_n$
(those obtained prior to the "termination" of the process of drawing observations).
In the sequel we shall limit ourselves to estimates whose second moments
are finite. The estimate is "unbiased" if $E\theta^*$, the expected value of $\theta^*$, is $\theta$.
When this is not so $E\theta^* - \theta$ is called the bias, $b(\theta)$, of $\theta^*$. In general the bias
is a function of $\theta$. It is obvious that the function $\theta^*$ may be undefined on a set
of points $(x_1, \cdots, x_n)$ whose probability is zero for all $\theta$ under consideration.
In the present paper we shall be concerned with an upper bound on the efficiency
of a sequential estimate, or, more precisely, with a lower bound on its
variance. This lower bound is intimately related to certain results on the efficiency
of the maximum likelihood estimate from a sample of fixed size. This is
not surprising since fixed-size sampling is a special instance of sequential sampling.
The results obtained in this paper are also obviously and intimately
related to those due to Cramér [4] and those described by him in [2], pp. 477-488.
Naturally the conditions of regularity (restrictions on $f(x, \theta)$, $\theta^*$, etc.) under
which the results are proved are different. For example, no restrictions on the
sequential sampling procedure need appear in the statement of a theorem which
deals only with samples of fixed size.

The argument below proceeds as if $f(x, \theta)$ were a probability density function.
The results apply equally well to the case where $f(x, \theta)$ is the probability function
of a discrete chance variable provided:

1). Integration is replaced by summation wherever this is obviously required.

2). The phrase "almost all points" in a Euclidean space of any finite dimensionality
is understood

a). as all points in the space with the possible exception of a set of Lebesgue
measure zero, when $f(x, \theta)$ is a probability density function,

b). as all points in the space with the possible exception of points one of whose
coordinates is a member of the set Z, when $f(x, \theta)$ is the probability function of a
discrete chance variable. The set Z consists of all points z such that $f(z, \theta) = 0$
identically for all $\theta$ under consideration.


3. Conditions of regularity. In this section we shall formulate the restrictions
which we impose on f, the estimates, and the sequential process. They are
intended to be such as will be satisfied in most cases of statistical interest. No
doubt they can be weakened, but the author has decided against attempting to
do so here. The list may seem long for two reasons. Seldom in the literature
are the assumptions which, for example, lead to validation of differentiation
under the integral sign etc., formulated explicitly. The presence of a sequential
procedure means that additional restrictions must be imposed.

In this section we assume that $\theta$ is a single parameter. The case where $\theta$ has
more than one component is treated later.

(3.1). The parameter $\theta$ lies in an open interval D of the real line. D may consist
of the entire line or of an entire half-line.

(3.2). The derivative $\dfrac{\partial f}{\partial \theta}$ exists for all $\theta$ in D and almost all x. We define
$\dfrac{\partial \log f(x, \theta)}{\partial \theta}$ as zero whenever $f(x, \theta) = 0$; thus $\dfrac{\partial \log f}{\partial \theta}$ is defined for all $\theta$ in D and
almost all x. We postulate that $E\dfrac{\partial \log f}{\partial \theta} = 0$ and that $E\left( \dfrac{\partial \log f}{\partial \theta} \right)^2$
be not zero for all $\theta$ in D.
(3.3).

$$E\left( \sum_{i=1}^{n} \left| \frac{\partial \log f(x_i, \theta)}{\partial \theta} \right| \right)^2$$

exists for all $\theta$ in D.

(3.4). Let $R_j$, (j = 1, 2, ...), be the set of points $(x_1, \cdots, x_j)$ in the j-dimensional
Euclidean space such that

$$\varphi_i(x_1, \cdots, x_i) = 0, \qquad i = 1, 2, \cdots, j - 1,$$

$$\varphi_j(x_1, \cdots, x_j) = 1.$$

For any integral j there exists a non-negative L-measurable function $T_j(x_1, \cdots, x_j)$
such that

a).

$$\left| \theta^*(x_1, \cdots, x_j) \frac{\partial}{\partial \theta} \prod_{i=1}^{j} f(x_i, \theta) \right| < T_j(x_1, \cdots, x_j)$$

for all $\theta$ in D and almost all $(x_1, \cdots, x_j)$ in $R_j$,

b).

$$\int_{R_j} T_j(x_1, \cdots, x_j)\, dx_1 \cdots dx_j$$

is finite.

(3.5). Let

$$t_j(\theta) = \int_{R_j} \theta^*(x_1, \cdots, x_j) \prod_{i=1}^{j} f(x_i, \theta)\, dx_i, \qquad (j = 1, 2, \cdots).$$

We postulate the uniform convergence of the series

$$\sum_{j=1}^{\infty} \frac{dt_j(\theta)}{d\theta}$$

(the existence of $\dfrac{dt_j(\theta)}{d\theta}$ is a consequence of Assumption (3.4)) for all $\theta$ in D.


4. The case of one parameter. In this section we assume that $f(x, \theta)$ depends
on a single parameter $\theta$. In sections 5 and 6 we shall discuss the case when $\theta$
is a vector with more than one component.

We have

$$E\frac{\partial \log f}{\partial \theta} = 0$$

by (3.2). Define the chance variable

$$Y_n = \sum_{i=1}^{n} \frac{\partial \log f(x_i, \theta)}{\partial \theta}.$$

By an argument almost identical with that of [1], Theorem 1, or of Theorem 7.1
below, we have

$$(4.1)\qquad EY_n = 0.$$
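From Theorem 7.2 below we obtain

$$(4.2)\qquad EY_n^2 = En\; E\left( \frac{\partial \log f}{\partial \theta} \right)^2.$$

Let $\theta^*(x_1, \cdots, x_n)$ be an estimate of $\theta$ such that

$$E\theta^* = \theta + b(\theta).$$

Then

$$(4.3)\qquad \sum_{j=1}^{\infty} \int_{R_j} \theta^*(x_1, \cdots, x_j) \prod_{i=1}^{j} f(x_i, \theta)\, dx_i = \theta + b(\theta).$$

Differentiation of both members of (4.3) with respect to $\theta$ (Assumptions (3.4)
and (3.5)) gives

$$(4.4)\qquad E\theta^* Y_n = 1 + \frac{db}{d\theta}.$$

From (4.1) it follows that (4.4) gives the covariance between $\theta^*$ and $Y_n$. Hence
from (4.2)

$$(4.5)\qquad \sigma^2(\theta^*) \ge \left( 1 + \frac{db}{d\theta} \right)^2 \left[ En\; E\left( \frac{\partial \log f}{\partial \theta} \right)^2 \right]^{-1}.$$

When the bias $b(\theta)$ is constant, for example when $b(\theta) = 0$ in case $\theta^*$ is an
unbiased estimate, we have from (4.5)

$$(4.6)\qquad \sigma^2(\theta^*) \ge \left[ En\; E\left( \frac{\partial \log f}{\partial \theta} \right)^2 \right]^{-1}.$$

The equality sign in (4.6) will hold if $\theta^*$ may be written as $Z'(\theta)Y_n + Z''(\theta)$,
where $Z'$ and $Z''$ are functions of $\theta$. However, $\theta^*$ itself should not be a function
of $\theta$ if our argument is to remain valid. The subject is connected with the
question of the existence of a sufficient estimate.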

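Let $f(x, \theta)$ be defined as follows:

$f(x, \theta) = \theta^x (1 - \theta)^{1-x}, \quad (x = 0 \text{ or } 1;\ 0 < \theta < 1).$

Then

$\frac{\partial \log f(x, \theta)}{\partial\theta} = \frac{x}{\theta} - \frac{1 - x}{1 - \theta}, \qquad E\left(\frac{\partial \log f}{\partial\theta}\right)^2 = \frac{1}{\theta(1 - \theta)}.$

Suppose $\theta^*$ is unbiased. Then $\sigma^2(\theta^*) \geq \theta(1 - \theta)(En)^{-1}$, a result first given by
Girshick [3] under unspecified regularity conditions.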
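
The binomial example can be checked by direct computation. The following is a minimal sketch, assuming exact rational arithmetic; the function names, the value of $\theta$, and the value taken for $En$ are illustrative choices and do not come from the text.

```python
from fractions import Fraction


def bernoulli_information(theta):
    """E[(d/dtheta log f(X, theta))^2] for f(x, theta) = theta^x (1 - theta)^(1 - x)."""
    info = Fraction(0)
    for x in (0, 1):
        prob = theta if x == 1 else 1 - theta
        # Score of a single observation: x / theta - (1 - x) / (1 - theta).
        score = Fraction(x) / theta - Fraction(1 - x) / (1 - theta)
        info += prob * score ** 2
    return info


def variance_bound(theta, expected_n):
    """Right member of (4.6): the reciprocal of En times the information."""
    return 1 / (expected_n * bernoulli_information(theta))


theta = Fraction(3, 10)      # illustrative parameter value
expected_n = Fraction(25)    # hypothetical value of En
info = bernoulli_information(theta)
bound = variance_bound(theta, expected_n)
print(info, bound)
```

The sketch only evaluates the right member of (4.6); it does not construct a sequential procedure or an unbiased estimate attaining the bound.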

Let the functions $\varphi_1, \varphi_2, \ldots$ be such that $n(\omega)$ is a constant. We are then
dealing with samples of fixed size. The result (4.5) is then given in [2], p. 480,
under different conditions of regularity.


5. Regularity conditions for the case when $\theta$ has more than one component.

We suppose that $\theta = (\theta_1, \ldots, \theta_l)$ and that simultaneous estimates
$\theta_1^*(x_1, \ldots, x_n), \ldots, \theta_l^*(x_1, \ldots, x_n)$ of the components of $\theta$ are under discussion.
In the sequel we shall limit ourselves to the case when these estimates are all
unbiased.

We postulate the following regularity conditions which are sufficient to validate
section 6:

(5.1). The covariance matrix of the estimates $\theta_1^*, \ldots, \theta_l^*$ is non-singular for all
$\theta$ in $D$ (this time $D$ is an open interval of the $l$-dimensional parameter space).

(5.2). The conditions of section 3 are satisfied for each $\theta_i$ and $\theta_i^*$ $(i = 1, \ldots, l)$.


6. The ellipsoid of concentration when $\theta$ has more than one component. Let

$\theta = (\theta_1, \ldots, \theta_l).$

We shall first describe briefly the result of Cramér [4] which refers to samples
of fixed size $n > 1$. Let $\theta_i^*(x_1, \ldots, x_n)$ be an unbiased estimate of
$\theta_i$ $(i = 1, \ldots, l)$. Let $\|\lambda_{ij}\|$ be the non-singular covariance matrix of the
$\theta_i^*$, and let $\|\lambda^{ij}\|$ be its inverse. The “ellipsoid of concentration” in the space
of points $(k_1, \ldots, k_l)$ is defined as

(6.1) $\sum_{i,j=1}^{l} \lambda^{ij}(k_i - \theta_i)(k_j - \theta_j) = l + 2.$

If a unit mass be distributed uniformly over this ellipsoid it will have the point
$(\theta_1, \ldots, \theta_l)$ as its center of gravity and $\lambda_{ij}$ as its product of inertia about the
corresponding axes. Cramér proves that, subject to certain regularity conditions,
there is a fixed ellipsoid

(6.2) $\sum_{i,j} \mu_{ij}(k_i - \theta_i)(k_j - \theta_j) = l + 2$

where

$\mu_{ij} = nE\left(\frac{\partial \log f}{\partial\theta_i}\, \frac{\partial \log f}{\partial\theta_j}\right),$

which is always contained entirely within the concentration ellipsoid of any set
of unbiased estimates. The two ellipsoids coincide only under certain conditions,
among which is that the $\theta_i^*$ be jointly sufficient estimates of the $\theta_i$.

Let us now consider the sequential procedure of this paper and postulate the
regularity conditions of section 5. Let

$K = \|k_{ij}\|$

be a matrix with real elements such that $|K| = 1$ and let

$K^{-1} = \|k^{ij}\|$

be its inverse. Let $\|\theta\|$, $\|\psi\|$, $\|\theta^*\|$, $\|\psi^*\|$ be column matrices. Suppose

(6.3) $\|\psi\| = K\|\theta\|.$

Then

(6.4) $\|\theta\| = K^{-1}\|\psi\|.$

Define

$\|\psi^*\| = K\|\theta^*\|.$

From section 4 we have

(6.5) $En\, E\left(\frac{\partial \log f(x, \theta)}{\partial\psi_1}\right)^2 \geq \frac{1}{\sigma^2(\psi_1^*)}$

where the differentiation by which $\partial \log f / \partial\psi_1$ is obtained is performed with $\psi_2, \ldots, \psi_l$
held constant. Consider the last $(l - 1)$ rows of $K$ as fixed and $(k_{11}, k_{12}, \ldots, k_{1l})$
as free to vary subject only to the restriction that $|K| = 1$. The left member
of (6.5) is then a fixed quantity, while the right member is a function of the first
row of $K$. The inequality (6.5) must remain valid for all admissible
$(k_{11}, \ldots, k_{1l})$. Hence (6.5) will remain valid if the right member of (6.5) is
replaced by its maximum with respect to $(k_{11}, \ldots, k_{1l})$. We shall obtain this
maximum and find that (6.5) then implies a result about the minimal ellipsoid
of concentration.

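The problem is therefore to minimize $\sigma^2(\psi_1^*)$. Now

(6.6) $\sigma^2(\psi_1^*) = \sum_{i,j} \lambda_{ij} k_{1i} k_{1j}.$

The family of ellipsoids in the space of $(k_{11}, \ldots, k_{1l})$

(6.7) $\sum_{i,j} \lambda_{ij} k_{1i} k_{1j} = c,$

where $c$ is a running parameter, has all centers located at the origin. Let

$(\bar{k}_{11}, \ldots, \bar{k}_{1l})$

be the sought-for values of $(k_{11}, \ldots, k_{1l})$, that is, those which minimize $\sigma^2(\psi_1^*)$.
From the definitions of $K$ and $K^{-1}$ we have

(6.8) $\sum_{i} k^{i1} k_{1i} = 1$

where $(k^{11}, \ldots, k^{l1})$ are constants. It follows that the minimum value
$c_0$ of $\sigma^2(\psi_1^*)$ is such that the ellipsoid

(6.9) $\sum_{i,j} \lambda_{ij} k_{1i} k_{1j} = c_0$

is tangent to the hyperplane (6.8) at the point $(\bar{k}_{11}, \ldots, \bar{k}_{1l})$. Now the tangent
plane to (6.9) at this point is given by

(6.10) $\sum_{i,j} \lambda_{ij} \bar{k}_{1j} k_{1i} = c_0.$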
itJ 
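From (6.8) and (6.10) we obtain

(6.11) $\sum_{j} \lambda_{ij} \bar{k}_{1j} = c_0 k^{i1}, \quad (i = 1, \ldots, l).$

Hence

(6.12) $\bar{k}_{1i} = c_0 \sum_{j} \lambda^{ij} k^{j1},$

from which

(6.13) $c_0 \sum_{i,j} \lambda^{ij} k^{i1} k^{j1} = 1.$

We have

(6.14) $\frac{\partial \log f}{\partial\psi_1} = \sum_{i} k^{i1} \frac{\partial \log f}{\partial\theta_i}, \qquad \left(\frac{\partial \log f}{\partial\psi_1}\right)^2 = \sum_{i,j} k^{i1} k^{j1} \frac{\partial \log f}{\partial\theta_i}\, \frac{\partial \log f}{\partial\theta_j}.$

From (6.5), (6.13), (6.14), and the definition of $c_0$ we conclude that

(6.15) $\sum_{i,j} \mu'_{ij} k^{i1} k^{j1} \geq \sum_{i,j} \lambda^{ij} k^{i1} k^{j1}$

where

(6.16) $\mu'_{ij} = En\, E\left(\frac{\partial \log f}{\partial\theta_i}\, \frac{\partial \log f}{\partial\theta_j}\right).$

We may restate (6.15) as follows. The concentration ellipsoid

(6.17) $\sum_{i,j} \lambda^{ij}(k_i - \theta_i)(k_j - \theta_j) = l + 2$

of the unbiased estimates $\theta_1^*, \ldots, \theta_l^*$ always contains within itself the ellipsoid

(6.18) $\sum_{i,j} \mu'_{ij}(k_i - \theta_i)(k_j - \theta_j) = l + 2$

where the $\mu'_{ij}$ are defined by (6.16).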

The question of the coincidence of the two ellipsoids is connected with the 
question of the existence of sufficient estimates. It may be difficult to state 
any general results about the concentration ellipsoid of biased estimates without 
postulating some relationships among the biases and/or their derivatives. 

7. On Wald’s equation and related results in sequential analysis. In section 4
we referred to a proof by Blackwell [1] of an equation due to Wald [6]
which is fundamental in the Wald theory of sequential tests of statistical hypotheses.
Here we shall give a perhaps simpler proof of this equation, and then prove
several new and related results of general interest for sequential analysis.

The results of Theorems 7.2 and 7.3 below can be obtained by differentiation
of Wald’s fundamental identity of sequential analysis ([6], [7]). However, the
conditions under which we obtain these results are less stringent than any so far
found sufficient to establish the identity and the validity of differentiating it.
Theorem 7.4 and its corollaries refer to sequential processes where the chance
variables may have different distributions or even be dependent. In the future
we hope to return to the question of finding all central moments of $Z_n$, the
problem of generalizing the fundamental identity, and related questions.

For Theorems 7.1, 7.2, and 7.3 we shall assume a chance variable $X$ whose
cumulative distribution function $F(x)$ is subject only to whatever restrictions
may be explicitly imposed on it in each theorem. We assume the existence of a
general sequential process such as is described above, which is subject only to
such restrictions as may be explicitly formulated in each theorem. The sequential
process of course defines the chance variable $n$. Let $x_1, x_2, \ldots$ be successive
independent observations on $X$. We define $Z_n = \sum_{i=1}^{n} x_i$. If $E(X)$ and
$\sigma^2(X)$ exist we shall denote them by $w$ and $\sigma^2$, respectively.

Theorem 7.1 (Wald [5], Blackwell [1]). Suppose $w$ and $En$ exist. Then

(7.1) $E(Z_n - nw) = 0.$

The following theorem, which is a sort of partial converse of Theorem 7.1, is
proved concomitantly with Theorem 7.1:

Theorem 7.1.1. If $EZ_n$ exists, and if either $P\{X > 0\} = 0$ or $P\{X < 0\} = 0$,
then $w$ and $En$ both exist, and

$EZ_n = wEn.$

Actually the same proof suffices for a somewhat stronger form of Theorem 7.1.1:

Theorem 7.1.2. If $EZ_n$ exists, and if

$E(X_i \mid n = j) \geq 0 \quad (\text{or} \leq 0)$

for all positive integral $j$ such that $P\{n = j\} \neq 0$, and all $i \leq j$, then $w$ and $En$
both exist, and

$EZ_n = wEn.$

Theorem 7.2. If $E\left[\sum_{i=1}^{n} |x_i - w|\right]^2$ exists, then $\sigma^2$ and $En$ both exist, and

(7.2) $E(Z_n - nw)^2 = \sigma^2 En.$

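We have

(7.3) $E(Z_n - nw) = E\left(\sum_{i=1}^{n} (x_i - w)\right) = \sum_{j=1}^{\infty} \int_{R_j} \left(\sum_{i=1}^{j} (x_i - w)\right) \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} (x_j - w) \prod_{m=1}^{i} dF(x_m).$

Also

(7.4) $\sum_{i=j}^{\infty} \int_{R_i} (x_j - w) \prod_{m=1}^{i} dF(x_m) = P\{n \geq j\}\, E(x_j - w) = 0.$

Hence

(7.5) $\sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} (x_j - w) \prod_{m=1}^{i} dF(x_m) = 0.$

From this (7.1) follows.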

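Suppose now that the conditions of Theorem 7.2 are fulfilled. We have

(7.6) $E(Z_n - nw)^2 = \sum_{j=1}^{\infty} \int_{R_j} \left(\sum_{i=1}^{j} (x_i - w)\right)^2 \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} (x_j - w)^2 \prod_{m=1}^{i} dF(x_m)$
$\qquad\quad + 2 \sum_{j=2}^{\infty} \sum_{s=1}^{j-1} \sum_{i=j}^{\infty} \int_{R_i} (x_s - w)(x_j - w) \prod_{m=1}^{i} dF(x_m).$

Let $s < j$ be any two positive integers. Then

(7.7) $\sum_{i=j}^{\infty} \int_{R_i} (x_s - w)(x_j - w) \prod_{m=1}^{i} dF(x_m) = 0.$

Hence

(7.8) $\sum_{j=2}^{\infty} \sum_{s=1}^{j-1} \sum_{i=j}^{\infty} \int_{R_i} (x_s - w)(x_j - w) \prod_{m=1}^{i} dF(x_m) = 0.$

In a similar manner we obtain

(7.9) $\sum_{i=j}^{\infty} \int_{R_i} (x_j - w)^2 \prod_{m=1}^{i} dF(x_m) = \sigma^2 P\{n \geq j\}.$

From (7.6), (7.8), and (7.9) it therefore follows that

(7.10) $E(Z_n - nw)^2 = \sigma^2 \sum_{j=1}^{\infty} P\{n \geq j\} = \sigma^2 \sum_{j=1}^{\infty} j P\{n = j\} = \sigma^2 En$

which is the desired result.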
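
Equations (7.1) and (7.2) can be verified on a small case by enumerating every sample path. The following is a minimal sketch under assumed inputs: the two-point distribution of $X$, the stopping rule, and the truncation point `MAX_N` are hypothetical choices made only so that the enumeration is finite and exact.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distribution of X: value -> probability.
VALUES = {-1: Fraction(1, 2), 2: Fraction(1, 2)}
MAX_N = 5  # the rule below always stops by the fifth observation


def stopping_time(path):
    """Stop at the first j whose partial sum is >= 2, and in any case at MAX_N."""
    total = 0
    for j, x in enumerate(path, start=1):
        total += x
        if total >= 2 or j == MAX_N:
            return j
    raise ValueError("path shorter than MAX_N")


def moments():
    w = sum(x * p for x, p in VALUES.items())
    var = sum((x - w) ** 2 * p for x, p in VALUES.items())
    e_n = e_dev = e_dev2 = Fraction(0)
    # Enumerate full paths of length MAX_N; coordinates after n do not affect Z_n.
    for path in product(VALUES, repeat=MAX_N):
        prob = Fraction(1)
        for x in path:
            prob *= VALUES[x]
        n = stopping_time(path)
        dev = sum(path[:n]) - n * w
        e_n += prob * n
        e_dev += prob * dev
        e_dev2 += prob * dev ** 2
    return w, var, e_n, e_dev, e_dev2


w, var, e_n, e_dev, e_dev2 = moments()
print(e_n, e_dev, e_dev2, var * e_n)
```

Because the decision to stop at stage $j$ depends only on $x_1, \ldots, x_j$, the rule is a sequential process in the sense used here, and exact rational arithmetic allows the two members of (7.2) to be compared without rounding.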

It remains to prove the validity of rearranging the series in (7.3) and (7.6).
First, we have

(7.11) $\sum_{i=j}^{\infty} \int_{R_i} |x_j - w| \prod_{m=1}^{i} dF(x_m) = P\{n \geq j\}\, E|X - w|.$

Hence it follows that

(7.12) $\sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} |x_j - w| \prod_{m=1}^{i} dF(x_m) = \sum_{j=1}^{\infty} P\{n \geq j\}\, E|X - w|$
$\qquad = E|X - w| \sum_{j=1}^{\infty} j P\{n = j\} = E|X - w| \cdot En.$

This justifies the rearrangement of terms in the series in (7.3). Second, the
series (7.6) is dominated by the series

(7.13) $\sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} (x_j - w)^2 \prod_{m=1}^{i} dF(x_m)$
$\qquad + 2 \sum_{j=2}^{\infty} \sum_{s=1}^{j-1} \sum_{i=j}^{\infty} \int_{R_i} |x_s - w| \cdot |x_j - w| \prod_{m=1}^{i} dF(x_m)$

all of whose terms are positive. The series (7.13) converges because

(7.14) $E\left[\sum_{i=1}^{n} |x_i - w|\right]^2 < +\infty.$

Hence the rearrangement of the series (7.6) is valid.

In the sequel we require certain sets $R'_j$ $(j = 1, 2, \ldots)$ which we shall define
now. Let $R_{ij}$, $i \leq j$, be the totality of all points $(x_1, \ldots, x_j)$ such that

(7.15) $(x_1, \ldots, x_i) \in R_i.$

Let $R^j$ be the $j$-dimensional Euclidean space. Then

(7.16) $R'_j = R^j - \sum_{i=1}^{j} R_{ij}.$

We shall now prove:

Theorem 7.3. Suppose that $E\left[\sum_{i=1}^{n} |x_i - w|\right]^3$ and $En\left[\sum_{i=1}^{n} |x_i - w|\right]$
exist.¹ Then

(7.17) $E(Z_n - nw)^3 = w_3 En + 3\sigma^2 En(Z_n - nw)$

where

$w_3 = E(X - w)^3.$

¹ The author has succeeded in proving that the existence of $E\left[\sum_{i=1}^{n} |x_i - w|\right]^3$ implies
the existence of $En\left[\sum_{i=1}^{n} |x_i - w|\right]$. This proof will be published subsequently in
connection with other results.
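Proof: We have

(7.18) $E(Z_n - nw)^3 = \sum_{j=1}^{\infty} \int_{R_j} \left[\sum_{i=1}^{j} (x_i - w)\right]^3 \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sum_{j=1}^{\infty} \int_{R_j} \sum_{i=1}^{j} (x_i - w)^3 \prod_{m=1}^{j} dF(x_m)$
$\qquad\quad + 3 \sum_{j=2}^{\infty} \int_{R_j} \sum_{i=2}^{j} \sum_{s=1}^{i-1} (x_s - w)(x_i - w)^2 \prod_{m=1}^{j} dF(x_m)$
$\qquad\quad + 3 \sum_{j=2}^{\infty} \int_{R_j} \sum_{i=2}^{j} \sum_{s=1}^{i-1} (x_s - w)^2 (x_i - w) \prod_{m=1}^{j} dF(x_m)$
$\qquad\quad + 6 \sum_{j=3}^{\infty} \int_{R_j} \sum_{i=3}^{j} \sum_{s=2}^{i-1} \sum_{t=1}^{s-1} (x_t - w)(x_s - w)(x_i - w) \prod_{m=1}^{j} dF(x_m).$

Considering the first term in the right member of (7.18), it follows that

(7.19) $\sum_{j=1}^{\infty} \int_{R_j} \left[\sum_{i=1}^{j} (x_i - w)^3\right] \prod_{m=1}^{j} dF(x_m) = \sum_{i=1}^{\infty} \sum_{j=i}^{\infty} \int_{R_j} (x_i - w)^3 \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sum_{i=1}^{\infty} w_3 P\{n \geq i\} = \sum_{i=1}^{\infty} i\, w_3 P\{n = i\} = w_3 En.$

All the rearrangements of terms in the operations involved in the proof of Theorem 7.3
are legitimate because the various series are absolutely convergent.

As for the second term in the right member of (7.18), we have

(7.20) $\sum_{j=2}^{\infty} \int_{R_j} \sum_{i=2}^{j} \sum_{s=1}^{i-1} (x_s - w)(x_i - w)^2 \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sum_{s=1}^{\infty} \sum_{i=s+1}^{\infty} \sum_{j=i}^{\infty} \int_{R_j} (x_s - w)(x_i - w)^2 \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sigma^2 \sum_{s=1}^{\infty} \sum_{i=s+1}^{\infty} \int_{R'_{i-1}} (x_s - w) \prod_{m=1}^{i-1} dF(x_m)$
$\qquad = \sigma^2 \sum_{s=1}^{\infty} \sum_{i=s}^{\infty} \int_{R'_i} (x_s - w) \prod_{m=1}^{i} dF(x_m).$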

We now operate on $En(Z_n - nw)$, and obtain

(7.21) $En(Z_n - nw) = \sum_{j=1}^{\infty} \int_{R_j} j \sum_{i=1}^{j} (x_i - w) \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} i\,(x_j - w) \prod_{m=1}^{i} dF(x_m).$





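We observe that

(7.22) $\sum_{i=j}^{\infty} \int_{R_i} i\,(x_j - w) \prod_{m=1}^{i} dF(x_m) = j \sum_{i=j}^{\infty} \int_{R_i} (x_j - w) \prod_{m=1}^{i} dF(x_m)$
$\qquad + \sum_{s=j+1}^{\infty} \sum_{i=s}^{\infty} \int_{R_i} (x_j - w) \prod_{m=1}^{i} dF(x_m).$

To evaluate the left member of (7.22), we proceed as follows: It is easy to see that

(7.23) $\sum_{i=j}^{\infty} \int_{R_i} (x_j - w) \prod_{m=1}^{i} dF(x_m) = 0.$

Moreover, when $s > j$,

(7.24) $\sum_{i=s}^{\infty} \int_{R_i} (x_j - w) \prod_{m=1}^{i} dF(x_m) = \int_{R'_{s-1}} (x_j - w) \prod_{m=1}^{s-1} dF(x_m).$

Hence

(7.25) $\sum_{i=j}^{\infty} \int_{R_i} i\,(x_j - w) \prod_{m=1}^{i} dF(x_m) = \sum_{s=j}^{\infty} \int_{R'_s} (x_j - w) \prod_{m=1}^{s} dF(x_m).$

Therefore

(7.26) $En(Z_n - nw) = \sum_{j=1}^{\infty} \sum_{s=j}^{\infty} \int_{R'_s} (x_j - w) \prod_{m=1}^{s} dF(x_m).$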

It remains now to consider the third term of the right member of (7.18).
We have

(7.27) $\sum_{j=2}^{\infty} \int_{R_j} \sum_{i=2}^{j} \sum_{s=1}^{i-1} (x_s - w)^2 (x_i - w) \prod_{m=1}^{j} dF(x_m)$
$\qquad = \sum_{s=1}^{\infty} \sum_{i=s+1}^{\infty} \sum_{j=i}^{\infty} \int_{R_j} (x_s - w)^2 (x_i - w) \prod_{m=1}^{j} dF(x_m).$

Now, suppose that in the expression

(7.28) $V_{sij} = \int_{R_j} (x_s - w)^2 (x_i - w) \prod_{m=1}^{j} dF(x_m)$

where $j \geq i > s$, we integrate with respect to all $x_m$ for which $m > i$. Then
it is not difficult to see that

(7.29) $\sum_{j=i}^{\infty} V_{sij} = 0$

for all $s$ and $i$ such that $1 \leq s < i$. Hence from (7.27)

(7.30) $\sum_{j=2}^{\infty} \int_{R_j} \sum_{i=2}^{j} \sum_{s=1}^{i-1} (x_s - w)^2 (x_i - w) \prod_{m=1}^{j} dF(x_m) = 0.$

In a similar way it is shown that the fourth term of the right member of (7.18)
is zero.

The desired result (7.17) is a direct consequence of (7.18), (7.19), (7.20),
(7.26), and (7.30).
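
The third-moment relation (7.17) can be checked in the same exact manner as (7.1) and (7.2). The following is a minimal sketch under assumed inputs: a skew two-point variable and a truncated "stop at the first success" rule, both hypothetical and chosen only so that $w_3 \neq 0$ and the enumeration is finite.

```python
from fractions import Fraction
from itertools import product

P_ONE = Fraction(1, 4)  # hypothetical P{X = 1}; X is 0 otherwise
MAX_N = 4               # stop at the first 1, or after MAX_N observations


def stopping_time(path):
    for j, x in enumerate(path, start=1):
        if x == 1 or j == MAX_N:
            return j
    raise ValueError("path shorter than MAX_N")


def third_moment_terms():
    w = P_ONE
    var = P_ONE * (1 - w) ** 2 + (1 - P_ONE) * (0 - w) ** 2
    w3 = P_ONE * (1 - w) ** 3 + (1 - P_ONE) * (0 - w) ** 3
    e_n = lhs = cross = Fraction(0)
    for path in product((0, 1), repeat=MAX_N):
        prob = Fraction(1)
        for x in path:
            prob *= P_ONE if x == 1 else 1 - P_ONE
        n = stopping_time(path)
        dev = sum(path[:n]) - n * w   # Z_n - n w on this path
        e_n += prob * n
        lhs += prob * dev ** 3        # contributes to E(Z_n - n w)^3
        cross += prob * n * dev       # contributes to E n (Z_n - n w)
    return lhs, w3 * e_n + 3 * var * cross


lhs, rhs = third_moment_terms()
print(lhs, rhs)
```

For a sample of fixed size the cross term $En(Z_n - nw)$ vanishes and (7.17) reduces to the familiar additivity of third central moments; the stopping rule above makes that term non-zero.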

Consider now an infinite sequence of chance variables $X_1, X_2, \ldots$, which
need not have the same distribution and which may be dependent (in which
case they must satisfy the obvious consistency relationships). We take successive
observations on these chance variables and define a sequential process
as above, which is subject only to such restrictions as we shall explicitly state.
Let $Z_n$ maintain its previous definition.

Theorem 7.4. Suppose that

(7.31) $\nu_i = E(X_i \mid n \geq i)$

exists for all positive integral $i$ for which $P\{n \geq i\} \neq 0$. In those cases write

(7.32) $\nu'_i = E(|X_i - \nu_i| \mid n \geq i).$

Suppose also that the series

(7.33) $\sum_{i=1}^{\infty} (\nu'_1 + \cdots + \nu'_i) P\{n = i\}$

converges. Then

(7.34) $E\left(Z_n - \sum_{i=1}^{n} \nu_i\right) = 0.$

It is regrettable but unavoidable that the mean values $\nu_i$ and $\nu'_i$ entering into
(7.33) and (7.34) be conditional. The fundamental reason is that the sequential
process may drastically modify the distribution of dependent chance variables,
so that their distribution for our purposes can only be considered in conjunction
with the sequential process itself. Consider the following example:

$P\{X_1 = -1\} = \tfrac{1}{2}, \qquad P\{X_1 = 1\} = \tfrac{1}{2},$

$P\{X_2 = -2 \mid X_1 = -1\} = \tfrac{1}{2},$

$P\{X_2 = -1 \mid X_1 = -1\} = \tfrac{1}{2},$

$P\{X_2 = 1 \mid X_1 = 1\} = \tfrac{1}{2},$

$P\{X_2 = 2 \mid X_1 = 1\} = \tfrac{1}{2}.$

We have $E(X_2) = 0$. Suppose we define the following sequential process:
if $X_1 = -1$, $n = 1$; if $X_1 = 1$, $n = 2$. It is then clear that for our purposes
$X_2$ can take no negative values and the fact that $E(X_2) = 0$ is of no use to us.
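
The role of the conditional means in Theorem 7.4 can be seen by working the example numerically. The following is a minimal sketch; it assumes that each conditional probability in the example equals 1/2 (the printed fractions are not fully legible), and all function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
# Joint distribution of (X1, X2); the conditional probabilities of 1/2 are an assumption.
JOINT = {
    (-1, -2): HALF * HALF,
    (-1, -1): HALF * HALF,
    (1, 1): HALF * HALF,
    (1, 2): HALF * HALF,
}


def stop(x1):
    """Sequential process of the example: n = 1 if X1 = -1, n = 2 if X1 = 1."""
    return 1 if x1 == -1 else 2


def conditional_mean(index):
    """E(X_index | n >= index) for index = 1 or 2, as in (7.31)."""
    num = den = Fraction(0)
    for xs, prob in JOINT.items():
        if stop(xs[0]) >= index:
            num += prob * xs[index - 1]
            den += prob
    return num / den


def expectation_of_deviation(means):
    """E(Z_n - sum of the first n entries of `means`)."""
    total = Fraction(0)
    for xs, prob in JOINT.items():
        n = stop(xs[0])
        total += prob * (sum(xs[:n]) - sum(means[:n]))
    return total


nu = [conditional_mean(1), conditional_mean(2)]
unconditional = [sum(p * xs[i] for xs, p in JOINT.items()) for i in range(2)]
print(nu, expectation_of_deviation(nu), expectation_of_deviation(unconditional))
```

Under these assumptions the conditional means give an expectation of zero, in agreement with (7.34), while centering at the unconditional means $E(X_1) = E(X_2) = 0$ does not.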
If, however, the chance variables $X_1, X_2, \ldots$ are independent, this difficulty
disappears, and we have the following

Corollary 1 to Theorem 7.4. If the chance variables $X_1, X_2, \ldots$ are independent,
we have Theorem 7.4 with $\nu_i = E(X_i)$, and $\nu'_i = E|X_i - \nu_i|$.

If further all the $X_i$ have the same distribution, we see that Theorem 7.1 is
a special case of Theorem 7.4, since the convergence of the series (7.33) is then
a consequence of the existence of $w$ and $En$. From this argument we see, however,
that it is not necessary that all the $X_i$ have the same distribution, and we
may write the following generalization of Theorem 7.1:

Corollary 2 to Theorem 7.4. Let the $X_i$ be independent with, in general,
different distributions. Suppose, however, that all $\nu_i$ are equal, and all $\nu'_i$ are equal,
except perhaps for those $i$ such that $P\{n \geq i\} = 0$. Suppose further that $En$ exists.
Then (7.1) holds.

Among possible fields of application of Theorem 7.4 are sequential tests of
composite statistical hypotheses, and the random walk of a particle governed
by probability distributions which are functions of time and the position of the
particle. The extension of this theorem to vector chance variables is straightforward.
The extension to higher moments may present difficulties. We hope
to return to some of these questions in the future.

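Proof of Theorem 7.4. This is very elementary. We have

(7.35) $E\left(Z_n - \sum_{i=1}^{n} \nu_i\right) = \sum_{j=1}^{\infty} \int_{R_j} \left[\sum_{i=1}^{j} (x_i - \nu_i)\right] dF(x_1, \ldots, x_j)$
$\qquad = \sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} (x_j - \nu_j)\, dF(x_1, \ldots, x_i)$
$\qquad = \sum_{j=1}^{\infty} P\{n \geq j\}\, E(X_j - \nu_j \mid n \geq j) = 0.$

The rearrangement of the series is valid because

(7.36) $\sum_{j=1}^{\infty} \sum_{i=j}^{\infty} \int_{R_i} |x_j - \nu_j|\, dF(x_1, \ldots, x_i) = \sum_{j=1}^{\infty} \nu'_j P\{n \geq j\}$
$\qquad = \sum_{i=1}^{\infty} (\nu'_1 + \cdots + \nu'_i) P\{n = i\}$

which converges by (7.33).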
ESTIMATION OF LINEAR FUNCTIONS OF CELL PROPORTIONS

By John H. Smith
Bureau of Labor Statistics

Summary. In this article certain contributions are made to the theory of
estimating linear functions of cell proportions in connection with the methods
of (1) least squares, (2) minimum chi-square, and (3) maximum likelihood.
Distinctions among these three methods made by previous writers arise out of
(1) confusion concerning theoretical vs. practical weights, (2) neglect of effects
of correlation between sampling errors, and (3) disagreement concerning methods
of minimization. Throughout the paper the equivalence of these three methods
from a practical point of view has been emphasized in order to facilitate the
integration and adaptation of existing statistical techniques. To this end:

1. The method of least squares as derived by Gauss in 1821-23 [6, pp. 224-
228] in which weights in theory are chosen so as to minimize sampling variances
is herein called the ideal method of least squares and the theoretical estimates
are called ideal linear estimates. This approach avoids confusion between
practical approximations and theoretical exact weights.

2. The ideal method of least squares is applied to uncorrelated linear functions
of correlated sample frequencies to determine the appropriate quantity
to minimize in order to derive ideal linear estimates in sample-frequency problems.
This approach leads to a sum of squares of standardized uncorrelated
linear functions of sampling errors in which statistics are to be substituted in
numerators.

3. A new elementary method is used to reduce the sum of squares in (2)
(before substitution of statistics) to Pearson’s expression for chi-square. In
this result, obtained without approximation, appropriate substitution of statistics
shows that the denominators of chi-square should be treated as constant
parameters in the differentiation process in order to minimize chi-square in
conformity with the ideal method of least squares.

4. The ideal method of minimum chi-square, derived in (3) as the sample-
frequency form of the ideal method of least squares, yields ideal linear estimates
in terms of the unknown parameters in the denominators of chi-square. When
these parameters are estimated by successive approximations in such a way as
to be consistent with statistics based on them, it is shown that the method of
minimum chi-square leads to maximum likelihood statistics.

5. An iterative method which converges to maximum likelihood estimates is
developed for the case in which observations are cross-classified and first order
totals are known. In comparison with Deming’s asymptotically efficient
statistics, it is shown that, in a certain sense, maximum likelihood statistics
are superior for any given value of n, especially in small samples.

6. The method of proportional distribution of marginal adjustments is developed.
This method yields estimates of expected cell frequencies whose
efficiency is 100 per cent when universe cell frequencies are proportional, a
condition closely approximated in most practical surveys for which first order
totals are available from complete censuses. Whether this favorable condition
is satisfied or not, the method yields results which are easy to interpret and it
has many computational advantages from the point of view of economy of time
and effort.

Throughout the article discussion is confined to the estimation of parameters
whose relationships to cell proportions are linear. However, most of the results
can be extended to the case of non-linear relationships, the necessary qualifications
being similar to those in curve-fitting problems when the function to be
fitted is not linear in its parameters. In this case, of course, least squares estimates
are not linear estimates. In particular, obvious extensions of the general
proofs in sections 5 and 6 make them applicable to the non-linear case. Thus
even when relationships are non-linear, it can be shown that the method of
minimum chi-square is the sample-frequency form of the method of least squares
which leads (by means of appropriate successive approximations) to maximum
likelihood statistics in sample-frequency problems. This principle, which
establishes the equivalence of the methods of least squares, minimum chi-square,
and maximum likelihood, greatly facilitates the integration and adaptation of
existing techniques developed in connection with these important methods of
estimation.

1. Introduction. This article deals with problems of statistical estimation in
which the parameters to be estimated are cell proportions or linear functions of
them. A simple illustration of this type of problem is that of estimating $p$,
the proportion of white men in a population classified by race and sex. From
a sample of $n$ persons selected at random from such a population, the desired
proportion can be estimated by simply taking the sample proportion of white
men as an estimate of the corresponding cell proportion in the population or
universe. This estimate is unbiased for all possible values of $p$ and its sampling
variance is $p(1 - p)/n$, assuming, for simplicity, that sampling is done with
replacements. Whether a more accurate unbiased estimate of $p$ can be derived
depends on whether or not any other relevant information concerning the cell
proportions in the universe is available. For example, it may be known that
all of the white portion of the population is composed of married couples so that
in the universe the number of white men is exactly equal to the number of white
women. This knowledge implies that half the proportion of whites provides an
unbiased estimate of $p$ which is far more accurate than the sample proportion
of white men. In fact, the sampling variance of half the proportion of whites
is equal to $(2p)(1 - 2p)/4n$, which is less than half the sampling variance of the
proportion of white men.
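
The two sampling variances quoted above follow from the usual multinomial formula for the variance of a linear function of cell frequencies. The following is a minimal sketch; the function name and the particular values of $p$ and $n$ are illustrative assumptions.

```python
from fractions import Fraction


def linear_estimate_variance(coeffs, probs, n):
    """Variance of T = sum(a_i * n_i) when the n_i are multinomial counts
    from n independent draws (sampling with replacement)."""
    mean_term = sum(a * p for a, p in zip(coeffs, probs))
    square_term = sum(a * a * p for a, p in zip(coeffs, probs))
    return n * (square_term - mean_term ** 2)


p = Fraction(1, 5)          # illustrative proportion of white men
n = 100                     # illustrative sample size
probs = (p, p, 1 - 2 * p)   # white men, white women, non-white persons

# Sample proportion of white men: T = n_1 / n.
var_men = linear_estimate_variance((Fraction(1, n), 0, 0), probs, n)
# Half the sample proportion of whites: T = (n_1 + n_2) / (2 n).
var_half_whites = linear_estimate_variance(
    (Fraction(1, 2 * n), Fraction(1, 2 * n), 0), probs, n
)
print(var_men, var_half_whites)
```

For any admissible $p$ the ratio of the second variance to the first is $(1 - 2p) / (2(1 - p))$, which is below one half, in agreement with the statement in the text.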

The term ideal linear estimate will be used to refer to any statistic which satisfies
the criteria of estimation implied by the foregoing discussion; that is, an
ideal linear estimate is any estimate which (1) is a linear function of the sample
observations; (2) is recognizable as unbiased by the research worker; and (3)
has minimum sampling variance among estimates which have properties (1)
and (2). These important criteria of estimation will now be stated in more
technical language.

Let $n_1$, $n_2$, and $n_3$ represent the number of (1) white men, (2) white women,
and (3) non-white persons, respectively, in samples of $n$ persons. Since any
linear function with a constant term can be reduced to the homogeneous form
by adding an appropriate multiple of the identity

(1.1) $n_1 + n_2 + n_3 - n = 0,$

it is possible, without loss of generality, to confine attention to linear estimates
of the form

(1.2) $T = a_1 n_1 + a_2 n_2 + a_3 n_3,$

which are recognizable as unbiased. In this example, the research worker is
assumed to know that the cell proportions in the universe are

(1.3) $p_1, p_2, p_3 = p,\ p,\ 1 - 2p.$

Hence, absence of bias implies that the expected value of $T$,

(1.4) $E(T) = a_1 n p_1 + a_2 n p_2 + a_3 n p_3 = (a_1 + a_2 - 2a_3) n p + n a_3,$

is identically equal to $p$; in other words, that

(1.5) $n(a_1 + a_2 - 2a_3) - 1 = 0$

and

$n a_3 = 0.$

The ideal linear estimate is derived by finding values of $a_1$, $a_2$, and $a_3$ which
minimize the sampling variance of $T$ subject to equations (1.5) as side conditions.¹
In this way it can be shown that half the sample proportion of whites
is actually the ideal linear estimate of $p$. For more general problems, the
process of minimization of sampling variances with the aid of Lagrange multipliers
involves expressions which are complicated algebraically. For this reason
it is usually easier to derive ideal linear estimates of parameters which are linear
functions of cell proportions by the ideal method of least squares which is
presented in section 4.

Like other least squares estimates, an ideal linear estimate of a linear function
of cell proportions depends on ideal least squares weights. Since these weights
are, in general, functions of variances and covariances of sample frequencies,
the theoretical connotation of the term “ideal” makes it preferable to other
terms such as “optimum” and “best.” In this connection it should be emphasized
that (1) the sampling variance of linear estimates is insensitive to
small errors in estimating ideal weights, and (2) the process of deriving practical
approximations to ideal linear estimates automatically provides maximum
likelihood estimates of the ideal weights. Thus the estimation of weights is
perfectly objective and the best practical approximations to ideal linear estimates
are expressed in terms of sample observations. This degree of objectivity
is rare in statistical estimation as a brief consideration of regression problems
will illustrate.

¹ In this example, it is possible to solve equations (1.5) for $a_2$ in terms of $a_1$, drop
subscripts, and substitute in the formula for the sampling variance of $T$ to obtain a quadratic
in $a$ to be minimized.

In ordinary regression problems, the ideal weights are inversely proportional
to error variances. It is usually necessary to draw upon past experience to
estimate relative weights because satisfactory estimates of error variances
are rarely available in terms of sample observations. From the present point
of view, the widespread use of equal weights implies the subjective “assumption”
that all error variances are equal. (Maximum likelihood estimates of regression
coefficients require, in addition, the even more subjective assumption of normality.)
In spite of these (usually implicit) subjective assumptions, discussions
of optimum properties of least squares regression coefficients based on
ideal weights in terms of unknown parameters are highly commendable because
(1) sampling variance is not very sensitive to small errors in weights and (2)
properties of theoretical ideal linear estimates furnish a simple basis for discussion
of the properties of practical statistics based on any reasonably good
approximations to the exact ideal weights. In any case, it is important to
know what the ideal weights are in terms of unknown parameters because
research workers can make better estimates if they know what quantities should
be estimated than they could otherwise.

2. Estimation of a single parameter. In sample-frequency problems, least
squares weights are rarely given explicitly or even implied by information
available to the research worker. Since the hypothetical example used in
Section 1 is a trivial special case from this point of view, a more realistic example
is presented in this section. Since the biological interpretation of this
problem is presented in detail in all but the first of the many editions of Fisher’s
well-known book [3] it is sufficient here to consider only the statistical problem.
The four cell proportions are

(2.1) $p_1, p_2, p_3, p_4 = (2 + \theta)/4,\ (1 - \theta)/4,\ (1 - \theta)/4,\ \theta/4,$

and the parameter $\theta$ is to be estimated from the set of sample frequencies

(2.2) $n_1, n_2, n_3, n_4 = 1997,\ 906,\ 904,\ 32,$

obtained in a sample of $n = 3839$ selected at random from an infinite universe.
Fisher considers five different statistics, $T_1$, $T_2$, $T_3$, $T_4$, and $T_5$, so it will
be convenient to use the symbol $T_0$ for the ideal linear estimate. Consider
the class of linear unbiased estimates of the form

(2.3) $T = a_1 n_1 + a_2 n_2 + a_3 n_3 + a_4 n_4,$

where absence of bias implies that

(2.4) $2a_1 + a_2 + a_3 = 0$

and

$a_1 - a_2 - a_3 + a_4 - 4/n = 0.$

Minimizing the sampling variance of $T$ in equation (2.3) subject to side
conditions based on equations (2.4) yields the ideal linear estimate $T_0$ defined
by the equation

(2.5) $n(1 + 2\theta) T_0 = 3\theta n_1 - 3\theta n_2 - 3\theta n_3 + (4 - \theta) n_4.$

The exact sampling variance of $T_0$,

(2.6) $\sigma^2_{T_0} = \frac{2\theta(1 - \theta)(2 + \theta)}{n(1 + 2\theta)},$

is used by Fisher as the asymptotic sampling variance of any efficient estimate
of $\theta$. The exact sampling variance of the ideal linear estimate is especially
appropriate as the asymptotic sampling variance of the maximum likelihood
estimate $T_4$ because $T_4$ is the limit of an iterative process designed to estimate
$T_0$ as closely as possible from sample data by using successive approximations
to $T_0$ for $\theta$ in equation (2.5). The limit of this process (which is, of course,
only an approximation to $T_0$) can be obtained by substituting the symbol $T_4$
for both $T_0$ and $\theta$ in equation (2.5) and solving the resulting quadratic equation
which can be reduced to

(2.7) $n T_4^2 - (n_1 - 2n_2 - 2n_3 - n_4) T_4 - 2n_4 = 0,$

an equation which is identical, except for notation, with Fisher’s equation of
maximum likelihood of which $T_4$ is the positive solution.
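
The iterative process described above can be carried out numerically with the frequencies of (2.2). The following is a minimal sketch, assuming equation (2.5) is used as a fixed-point map in floating-point arithmetic; the starting value, tolerance, and function names are illustrative choices.

```python
import math

COUNTS = (1997, 906, 904, 32)  # sample frequencies from (2.2)
N = sum(COUNTS)                # 3839


def ideal_linear_estimate(theta):
    """Solve (2.5) for T_0, with `theta` supplying the weights."""
    n1, n2, n3, n4 = COUNTS
    numerator = 3 * theta * (n1 - n2 - n3) + (4 - theta) * n4
    return numerator / (N * (1 + 2 * theta))


def iterate(start=0.5, tol=1e-12, max_steps=100):
    """Repeatedly substitute the latest estimate for theta in (2.5)."""
    theta = start
    for _ in range(max_steps):
        new_theta = ideal_linear_estimate(theta)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


def positive_root():
    """Positive solution of the quadratic (2.7)."""
    n1, n2, n3, n4 = COUNTS
    b = -(n1 - 2 * n2 - 2 * n3 - n4)
    c = -2 * n4
    return (-b + math.sqrt(b * b - 4 * N * c)) / (2 * N)


limit = iterate()
root = positive_root()
print(round(limit, 6), round(root, 6))
```

With these frequencies the iteration settles within a few steps, and its limit agrees with the positive root of (2.7) to the accuracy of the arithmetic.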

The foregoing result is a comparatively simple illustration of the general
principle that the maximum likelihood estimate of any linear function of cell
proportions is the limit of an iterative process designed to approximate the
corresponding linear estimate as closely as possible by means of sample frequencies.
Since the accuracy of estimates of least squares relative weights
increases with size of sample, maximum likelihood statistics have, in an asymptotic
sense for large samples, the same optimum properties which are possessed
in an exact sense (even for small samples) by the corresponding ideal linear
estimates. Thus the results obtained by means of the theory of large samples
are supported by the approach to estimation problems by means of ideal linear
estimates. In addition, the latter approach facilitates the integration of
available techniques as explained in later sections.
It is true that the optimum properties of maximum likelihood statistics can
be presented in terms of the theory of large samples, but the fact that a given
method of estimation yields a statistic whose asymptotic sampling variance is
a minimum does not imply that the same technique will yield a minimum
variance statistic for any given small value of $n$. For example, it is well known
that the median is a maximum likelihood estimate of the midpoint of a double
exponential universe. Nevertheless, in samples of three observations from
such a universe, another statistic (4/9 of the mean plus 5/9 of the median)
has greater relative advantage over the median than the median has over the
mean.

Fisher’s discussion of the relative efficiencies of his five alternative consistent
statistics suggests that it is impossible to formulate objective criteria for making
choices among alternative statistics such that each statistic will be used whenever
its sampling variance is smallest. Consider the sequence of universes generated
by letting $\theta$ vary from zero to unity. In general, each value of $\theta$ would determine
which of Fisher’s five statistics would have smallest sampling variance
for that particular universe for any given value of $n$. In comparison with
any other single statistic, the statistic $T_4$ would usually have smaller sampling
variance, but there are notable exceptions. For example, in the absence of
linkage when $\theta$ is equal to one-fourth, the statistic $T_2$ is the ideal linear estimate
and its sampling variance is smaller than that of $T_4$, at least for certain small
values of $n$. For this reason, Fisher used $T_2$ in preference to $T_4$ as the basis for
testing the significance of linkage. The statistic $T_5$, derived by Fisher’s method
of minimum chi-square, is also of special interest. Fisher’s method of minimum
chi-square yields statistics which differ from the corresponding maximum 
likelihood statistics because Fisher considers the denominators as variables in 
the process of differentiation instead of considering them as unknown para- 
meters to be estimated by identifying them with the corresponding statistics 
in the numerators after differentiation. Arguments of later sections tend to 
show that the latter method is more appropriate. In this example, it can be 
shown that if Ti were substituted for the corresponding parameter in the de- 
nominators of chi-square {and treated as a ■parameter) the minimization of cln- 
square with respect to statistics in its numerators w'ould be exactly equivalent 
to substituting 0.035785, the numerical value of for 6 in equation (2.5) and 
solving for Te to obtain 0.035717, a value wliich is much closer to 0.036712, 
the numericahvalue of the maximum likelihood estimate Tn than to Fisher’s Ti . 
In problems of estimation chi-square should be minimized in order to obtain 
efficient statistics — ^not to obtain a small criterion for testing goodness of fit — 
and it should be minimized in a manner consistent with this purpose. Whether 
or not it is possible to derive an even smaller value for a quantity called chi- 
square should be considered to be irrelevant in either estimation problems or 
tests of sigmficance. It is difficult to present these ideas in more technical 
language because it is possible to construct triidal hypothetical universes for 
which Fisher’s method of minimum chi-square provides statistics which are 
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superior in certain respects to the corresponding maxnnum likelihood statistics. 
Nevertheless, it seems clear that the ideal linear estimate usually has smaller 
sampling variance than the maximum likelihood statistic which, in turn, usually 
has smaller sampling variance than any other given practical statistic. Evi- 
dence presented in later sections tends to show that these advantages are more 
important in small samples than in cases in which the theory of large samples 
is applicable. 

3. The “ideal” method of least squares. When sample observations are
uncorrelated in successive samples and parameters to be estimated are linear
functions of the expected values of the sample observations, the method of least
squares yields ideal linear estimates of the parameters provided that the weight of
each observation is inversely proportional to its variance in successive samples.
Although the minimum sampling variance property among linear unbiased
estimates is seldom stressed, this principle of weighting has been presented in
connection with the method of least squares for more than a hundred years.
In order to emphasize the theoretical nature of weights which depend on variances
which are usually unknown in practice and to distinguish the method
based on such weights from the more familiar method of least squares with
equal weights, the method which yields ideal linear estimates will be called the
ideal method of least squares.
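
The weighting principle can be illustrated with a minimal Python sketch. It assumes three uncorrelated observations of one common expected value with known but unequal variances; the numbers and the function names are hypothetical and serve only to show that weights inversely proportional to the variances give a smaller sampling variance than equal weights.

```python
from fractions import Fraction

def ideal_weighted_mean(observations, variances):
    """Least squares estimate of a common expected value, weights = 1/variance."""
    weights = [Fraction(1) / Fraction(v) for v in variances]
    total = sum(weights)
    estimate = sum(w * Fraction(x) for w, x in zip(weights, observations)) / total
    sampling_variance = 1 / total  # variance of the ideally weighted estimate
    return estimate, sampling_variance

def linear_estimate_variance(coefficients, variances):
    """Variance of sum(c_i * x_i) when the x_i are uncorrelated."""
    return sum(Fraction(c) ** 2 * Fraction(v) for c, v in zip(coefficients, variances))

# Hypothetical data: the variances are treated as known (the "ideal" assumption).
obs = [10, 12, 11]
var = [1, 4, 2]

est, v_ideal = ideal_weighted_mean(obs, var)
v_equal = linear_estimate_variance([Fraction(1, 3)] * 3, var)

print(est, v_ideal, v_equal)
```

For these illustrative numbers the ideally weighted estimate has sampling variance 4/7, against 7/9 for the equally weighted mean.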

Discussion of the general problem of estimating linear functions of cell proportions
can be facilitated by making use of results obtained by other writers,
notably Gauss (as reported by Whittaker and Robinson [6]) and Pearson [4].
According to Whittaker and Robinson, “the first writer to connect the method
[of ideal least squares] with the theory of probability was Gauss” [6, p. 224].
In his Theoria Motus proof of 1809, Gauss derived the “most probable value”
[6, p. 223] of a parameter (i.e., the statistic which satisfies the criterion now
called maximum likelihood) for the case in which sample observations are statistically
independent and normally distributed. In his Theoria Combinationis
proof of 1821-23, Gauss “abandoned the ‘metaphysical’ basis” [6, p. 220] of
his earlier work and derived the method herein called the ideal method of least
squares (without approximation) from the criteria of (1) minimum variance and
(2) absence of bias for the case in which “the mean value of [the covariance of
a pair of errors] is zero” [6, p. 224]. Since the covariances of uncorrelated linear
functions are zero whether they are statistically independent or not, it follows
from the work of Gauss that the ideal method of least squares applied to uncorrelated
linear functions of sample frequencies yields ideal linear estimates.
In other words, the ideal method of least squares implies the following six steps:

1. From the set of k + 1 sample frequencies construct k linear functions
which are uncorrelated in successive samples.

2. From each function subtract its expected value in terms of the unknown
parameters to find its sampling error.

3. Write the ratio of each sampling error to its own standard error in the
form of a fraction.

4. Sum the squares of these standardized uncorrelated sampling errors to
obtain a quantity called chi-square.

5. Substitute statistics* for the parameters in the numerators of chi-square.

6. Minimize the sum of squares of residuals with respect to each statistic
in turn (subject to appropriate side conditions in case linear functions
not implied in preceding steps are known).

This series of six steps can be summarized by the single statement that the
function to minimize is the sum of squares of standardized uncorrelated residuals.
Actually this statement is oversimplified because even though sampling
errors are both uncorrelated and standardized, the corresponding residuals
are, in general, neither standardized nor uncorrelated.

4. Pearson’s expression for chi-square. As defined by Pearson [4], chi-square
is the sum of squares of a set of k standardized uncorrelated linear functions
of sampling errors in a set of k + 1 correlated sample frequencies. A set
of k standardized uncorrelated linear functions can be constructed in an infinite
number of ways, but each set can be obtained from any of the others by means
of an orthogonal transformation. Thus the sum of squares is the same no
matter what set is originally chosen. As his set of standardized uncorrelated
linear functions, Pearson chose those determined by the axes of the correlation
ellipse for which he gave the required sum of squares in terms of “minors” or
cofactors of the correlation determinant of the first k sample frequencies. Pearson
reduced this complicated expression to the now familiar form

(4.1)    $\chi^2 = \sum_{i=1}^{k+1} (n_i - n p_i)^2 / (n p_i),$

where $p_i$ is the proportion in the ith cell in the universe and $n_i$ is the frequency
in the ith cell of a sample of n observations selected at random from an infinite
universe (or with replacements from a finite universe).

The widespread misunderstanding of the nature of chi-square seems to be
based primarily on the facts that

1. Pearson’s rule for degrees of freedom is inadequate (see section 5), and

2. Pearson’s expression for chi-square can be derived by approximate methods
as well as by exact methods.

Pearson’s derivation of the expression for chi-square by exact methods is sufficient
to show that its derivation by approximate methods involves a paradox
in which different sets of approximations offset each other; however, Pearson’s
article is relatively inaccessible and, in addition, his algebraic reductions involve
the minors of a general determinant of the kth order. For these reasons, the
following exact derivation is presented in terms of elementary algebra.

* It is convenient to call these variable symbols “statistics”, the quantities whose
squares are summed, “residuals”, and the whole expression “chi-square,” even though,
from a certain point of view, these terms are strictly applicable only after the minimization
process. This usage should always be clear from its context.

Since the sum of squares is the same for any set of k standardized uncorrelated
linear functions of the sampling errors in k + 1 correlated frequencies, a set should
be chosen for which the algebraic reductions are as easy as possible. From this
point of view a satisfactory set, which can be written in any of three forms, is
given by

(4.2)    $y_i = p_i e_{i+} - p_{i+} e_i$
             $= p_{i+} e_{i-} + (p_i + p_{i+}) e_{i+}$
             $= -p_i e_{i-} - (p_i + p_{i+}) e_i,$

where $e_i = n_i - n p_i$ and $i+$ and $i-$ refer to classes formed by combining all
classes above the ith class and below the ith class, respectively.

By means of the known variances and covariances of the sample frequencies
in expected value form,

(4.3)    $E(e_i^2) = n p_i (1 - p_i),$

and

(4.4)    $E(e_i e_j) = -n p_i p_j,$

it can be shown that the variance of $y_i$ is

(4.5)    $E(y_i^2) = n p_i p_{i+} (p_i + p_{i+}),$

and, by using the third expression in equation (4.2) for $y_i$ and the second for
$y_j$, it can be shown that any pair of y’s are uncorrelated because

(4.6)    $E(y_i y_j) = 0, \quad (i < j).$

Let $z_i$ represent the variable $y_i$ expressed in standard-deviation units. The
square of this standardized uncorrelated linear function of correlated sampling
errors can be written

(4.7)    $z_i^2 = \frac{(p_i e_{i+} - p_{i+} e_i)^2}{n p_i p_{i+} (p_i + p_{i+})}.$

It remains to show that Pearson’s expression for chi-square can be obtained
by adding the k values of $z_i^2$ in succession. For this purpose it is convenient
to define

(4.8)    $\chi_r^2 = \sum_{i=1}^{r} \frac{e_i^2}{n p_i} + \frac{e_{r+}^2}{n p_{r+}},$

where $e_{r+}$ and $p_{r+}$ refer to the class
obtained by combining all classes above the rth class.

When r = k, the expression in equation (4.8) is the expression to be derived.
It remains to show that $\chi_k^2$ is the sum of squares of k standardized uncorrelated
linear functions of sampling errors; i.e.,

(4.9)    $\chi_k^2 = \sum_{i=1}^{k} z_i^2.$

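For the first cell $e_{1+} = -e_1$ and $p_{1+} = 1 - p_1$. Hence $y_1$ reduces to the negative
of the error in the first frequency and

(4.10)   $\chi_1^2 = e_1^2 / [n p_1 (1 - p_1)]$
              $= e_1^2 / (n p_1) + e_{1+}^2 / (n p_{1+}), \quad (p_{1+} = 1 - p_1),$

a special case expressed in the required form. The general case is established
by showing that

(4.11)   $\chi_{r-1}^2 + z_r^2 = \chi_r^2,$

or, alternatively, that

(4.12)   $z_r^2 = \chi_r^2 - \chi_{r-1}^2$
             $= \frac{e_r^2}{n p_r} + \frac{e_{r+}^2}{n p_{r+}} - \frac{(e_r + e_{r+})^2}{n (p_r + p_{r+})}$
             $= \frac{(p_{r+} e_r^2 + p_r e_{r+}^2)(p_r + p_{r+}) - p_r p_{r+} (e_r^2 + 2 e_r e_{r+} + e_{r+}^2)}{n p_r p_{r+} (p_r + p_{r+})}$
             $= \frac{p_r^2 e_{r+}^2 - 2 p_r p_{r+} e_r e_{r+} + p_{r+}^2 e_r^2}{n p_r p_{r+} (p_r + p_{r+})} = \frac{(p_r e_{r+} - p_{r+} e_r)^2}{n p_r p_{r+} (p_r + p_{r+})},$

thus establishing the derivation of Pearson’s expression for chi-square.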
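
The identity just derived can be checked numerically. The following Python sketch, using exact rational arithmetic, computes Pearson's expression (4.1) and the k squares $z_r^2$ of equation (4.7) for a hypothetical four-cell sample; the frequencies, the proportions, and the function names are assumptions made only for illustration.

```python
from fractions import Fraction

def pearson_chi_square(freqs, props):
    """Equation (4.1): sum of (n_i - n p_i)^2 / (n p_i) over all k + 1 cells."""
    n = sum(freqs)
    return sum((f - n * p) ** 2 / (n * p) for f, p in zip(freqs, props))

def z_squares(freqs, props):
    """Equation (4.7): the k standardized squares z_r^2, r = 1, ..., k."""
    n = sum(freqs)
    errors = [f - n * p for f, p in zip(freqs, props)]
    squares = []
    for r in range(len(freqs) - 1):
        e_r, p_r = errors[r], props[r]
        e_plus = sum(errors[r + 1:])   # error of the combined classes above r
        p_plus = sum(props[r + 1:])    # proportion of the combined classes above r
        y_r = p_r * e_plus - p_plus * e_r
        squares.append(y_r ** 2 / (n * p_r * p_plus * (p_r + p_plus)))
    return squares

# Hypothetical universe proportions and sample frequencies (n = 40).
props = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
freqs = [18, 12, 6, 4]

chi_square = pearson_chi_square(freqs, props)
parts = z_squares(freqs, props)
print(chi_square, parts, sum(parts))
```

In this example the three squares are 2/5, 1/5 and 2/5, and their sum agrees exactly with the value 1 given by Pearson's expression.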

When sampling is done without replacement each variance and covariance
is multiplied by (N - n)/(N - 1), where N is the number of observations in
the universe. Hence, chi-square for this case can be written

(4.13)   $\chi^2 = \frac{N - 1}{N - n} \sum_{i=1}^{k+1} \frac{(n_i - n p_i)^2}{n p_i}.$

This expression shows that the factor involving sampling errors is the same
whether sampling is done with replacement or without replacement. Hence,
the derivation of least squares statistics is the same for either method of sampling,
but sampling variances for the simpler case are multiplied by the factor
(N - n)/(N - 1) when sampling is done without replacement.
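
The effect of the factor (N - n)/(N - 1) can be verified by exact enumeration in a very small case. The Python sketch below assumes a hypothetical two-class universe (so that k = 1) and averages Pearson's expression over every possible sample drawn without replacement, with and without the correction of equation (4.13); the universe size, class size, sample size, and function name are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def mean_chi_square_without_replacement(N, N1, n):
    """Exact mean of chi-square for a two-class universe sampled without replacement.

    Returns (mean of the uncorrected Pearson expression,
             mean after multiplying by (N - 1)/(N - n) as in equation (4.13)).
    """
    p1 = Fraction(N1, N)
    p2 = 1 - p1
    factor = Fraction(N - 1, N - n)
    mean_plain = Fraction(0)
    for n1 in range(max(0, n - (N - N1)), min(n, N1) + 1):
        # Hypergeometric probability of n1 observations in the first class.
        prob = Fraction(comb(N1, n1) * comb(N - N1, n - n1), comb(N, n))
        n2 = n - n1
        pearson = (n1 - n * p1) ** 2 / (n * p1) + (n2 - n * p2) ** 2 / (n * p2)
        mean_plain += prob * pearson
    return mean_plain, factor * mean_plain

# Hypothetical universe of 10 observations, 4 in the first class, sample of 3.
plain, corrected = mean_chi_square_without_replacement(10, 4, 3)
print(plain, corrected)
```

For this universe the uncorrected mean is 7/9, while the corrected expression has mean exactly 1, the number of degrees of freedom.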


5. The method of minimum chi-square. The derivation of Pearson’s expression
for chi-square completes the first four steps of the ideal method of least
squares outlined in section 3. Hence, the method of minimum chi-square is
the sample-frequency form of the ideal method of least squares in which only
two of the six steps remain to be taken.

In his original article [4] Pearson pointed out that the use of statistics instead
of parameters would affect the value of chi-square but that such effects would
usually be so small that no allowance need be made for them in connection with
tests of significance. It is now well known that the average value of chi-square
is reduced approximately one unit for each parameter estimated from the sample,
and that the main portion of this effect is on the numerators; i.e., in large samples
the substitution of statistics for parameters in the denominators usually
has a negligible effect on the value of chi-square. By confining the discussion
to the case in which parameters are used in the denominators, it is possible to
make simple exact statements concerning the main effects in terms of the number
of squares of standardized uncorrelated linear functions (also known as the
number of degrees of freedom) and the mean value of chi-square.

When the expected values in the numerators of chi-square can be expressed
as linear functions of r algebraically independent parameters, ideal linear estimates
of the r parameters are determined by substituting statistics for the r
parameters and minimizing the resulting expression with respect to each statistic.
In general, such a substitution of statistics for parameters in the numerators
of chi-square reduces the number of degrees of freedom by one unit for
every parameter estimated; that is, the appropriately minimized chi-square
can be analyzed into k - r squares of standardized uncorrelated linear functions
of sampling errors.

The r ideal linear estimates are linear functions of the sample frequencies.
Let $(v_1, v_2, \ldots, v_r)$ be a set of standardized uncorrelated linear functions of
the correlated sampling errors in these statistics and let $(v_1, v_2, \ldots, v_k)$ be a set
of linear functions obtained from the z’s of section 4 by an orthogonal transformation.
Since the sum of squares is not changed by such a transformation,
chi-square is the sum of the k values of $v_i^2$. The process of substituting statistics
for the r parameters in the numerators of chi-square reduces the values of
the first r $v_i^2$’s to zero without affecting the values of the other (k - r) $v_i^2$’s.

Thus the appropriately minimized chi-square can be analyzed into k - r
squares of standardized uncorrelated linear functions of sampling errors and is
therefore said to have k - r degrees of freedom. The mean value of each square
is the variance of a standardized linear function of sampling errors and is therefore
unity by definition. Hence the mean value of the appropriately minimized
chi-square (with parameters in the denominators) is exactly k - r when r
statistics are estimated from a set of k + 1 sample frequencies.

The expression to be minimized is

(5.1)    $\chi^2 = \sum_{i=1}^{k+1} \frac{(n_i - m_i')^2}{n p_i},$

where $m_i'$ is the ideal linear estimate of $n p_i$. The set of statistics described
by the equation

(5.2)    $m_i' = n_i,$

reduces the value of chi-square to zero, its minimum value. This shows that
the sample cell proportion is the ideal linear estimate of the corresponding
parameter.

Whenever a linear function independent of the sum of the cell proportions is
known, it is possible to take advantage of additional information provided by
the known function by minimizing chi-square subject to an appropriate side
condition. When side conditions are used in this way, the number of degrees
of freedom for the minimized chi-square is equal to the number of side conditions
which are algebraically independent of each other (and of the sum of the cell
proportions). Let the known linear function be written

(5.3)    $\sum a_i n p_i - m = 0.$

In order to facilitate comparison of the typical equation of maximization
with the corresponding equation of the method of maximum likelihood, it is
convenient to minimize chi-square by maximizing $-\chi^2/2$ subject to a side
condition based on (5.3). The function to be maximized can be written

(5.4)    $-\chi^2/2 = \sum (n_i - m_i')^2 / (-2 n p_i) + h (\sum a_i m_i' - m),$

where h is a Lagrange multiplier. Setting the partial derivative of $-\chi^2/2$
with respect to $m_i'$ equal to zero, the typical equation for minimizing chi-square
can be written

(5.5)    $(n_i - m_i') / (n p_i) + h a_i = 0,$

a form which shows that, in general, ideal linear estimates are defined in terms
of unknown parameters. Fortunately, these parameters can usually be approximated
closely by an iterative process. Substituting $m_i$ for both $n p_i$ and $m_i'$
in equations (5.5), the typical equation in the limiting values of such a process
can be reduced to

(5.6)    $n_i / m_i - 1 + h a_i = 0,$

a form which is identical with the typical equation (6.6) of maximum likelihood
derived in section 6. This equality of typical equations implies that whenever
the denominators of chi-square are estimated in such a way as to be consistent
with least squares statistics based on them, the method of minimum chi-square
always leads (by means of approximations necessary in practice) to maximum
likelihood estimates of parameters which are linear functions of cell proportions.
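
The iterative process described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the side condition on the sum of the estimates is carried explicitly with its own multiplier g (equation (5.4) displays only the multiplier h), the first approximations to the denominators are the sample frequencies themselves, and the data, coefficients, and function names are hypothetical. The example takes three cells with the side condition that the first and third expected frequencies are equal, that is, a = (1, 0, -1) and m = 0.

```python
from fractions import Fraction

def min_chi_square_step(freqs, coeffs, target, denominators):
    """Minimize sum (n_i - m_i)^2 / c_i subject to sum m_i = n and sum a_i m_i = target.

    The minimizing values have the form m_i = n_i + c_i * (g + h * a_i), where the
    multipliers g and h solve a pair of linear equations.
    """
    s0 = sum(denominators)
    s1 = sum(a * c for a, c in zip(coeffs, denominators))
    s2 = sum(a * a * c for a, c in zip(coeffs, denominators))
    gap = target - sum(a * f for a, f in zip(coeffs, freqs))
    det = s0 * s2 - s1 * s1
    g = -s1 * gap / det
    h = s0 * gap / det
    return [f + c * (g + h * a) for f, a, c in zip(freqs, coeffs, denominators)]

def iterate_to_limit(freqs, coeffs, target, steps=10):
    """Use each set of estimates as the denominators of the next minimization."""
    estimates = [Fraction(f) for f in freqs]  # assumed first approximation: c_i = n_i
    for _ in range(steps):
        estimates = min_chi_square_step(freqs, coeffs, target, estimates)
    return estimates

# Hypothetical sample of n = 100 in three cells.
freqs = [30, 50, 20]
coeffs = [1, 0, -1]

first_step = min_chi_square_step(freqs, coeffs, 0, [Fraction(f) for f in freqs])
limit = iterate_to_limit(freqs, coeffs, 0)
print(first_step, limit)
```

With these numbers the first step gives 1200/49, 2500/49, 1200/49, and the limiting values are 25, 50, 25, which are the values that maximize the likelihood subject to the same side condition.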

6. The method of maximum likelihood. Maximum likelihood estimates of
linear functions of cell proportions can be obtained by (1) expressing the probability
function (general term of the multinomial expansion) in terms of the r
parameters to be estimated; (2) substituting r statistics for the r parameters;
and (3) maximizing with respect to the r statistics. In practice, this is usually
accomplished by maximizing the logarithm of the variable factor in step (3),
which can be written

(6.1)    $L = \sum n_i \log m_i,$

where $m_i$ is the maximum likelihood estimate of $n p_i$, the expected value of the
ith frequency $n_i$ in a sample of n observations classified into (k + 1) classes or
cells. It is evident that L as written has no maximum with respect to any $m_i$
since it increases without bound as $m_i$ increases, but it sometimes has a uniquely
determined maximum when each of the $m_i$’s is written explicitly in terms of
less than k + 1 algebraically independent statistics. In the general case it is
easier to maximize L subject to an appropriate set of side conditions, one of
which must be equivalent to

(6.2)    $m_1 + m_2 + \cdots + m_{k+1} - n = 0.$

When no linear function except the sum is known, the likelihood function
can be written

(6.3)    $L = \sum n_i \log m_i - (\sum m_i - n),$

a function which, subject to equation (6.2), is always equal to that in equation
(6.1) but which has a uniquely determined maximum. The typical equation of
maximum likelihood, obtained by setting the partial derivative of L with respect
to $m_i$ equal to zero, is

(6.4)    $n_i / m_i - 1 = 0,$

an equation which shows that each sample frequency is a maximum likelihood
estimate of its own expected value.

When a linear function such as that in equation (5.3) is known, an improved
set of maximum likelihood statistics can be found by maximizing

(6.5)    $L = \sum n_i \log m_i - (\sum m_i - n) + h (\sum a_i m_i - m).$

The typical equation of maximization is found to be

(6.6)    $n_i / m_i - 1 + h a_i = 0,$

an equation which, as stated above, is identical with equation (5.6). Since
equation (5.6) was obtained as the limit of an iterative process from the typical
equation (5.5) for minimizing chi-square subject to the same side condition
and since each additional side condition affects the typical equation of each
method in exactly the same way, the method of minimum chi-square and the
method of maximum likelihood are equivalent for the general case in the sense
that the method of minimum chi-square always leads to maximum likelihood
statistics as limits of an iterative process.
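
Equation (6.6) can also be solved directly for the multiplier h. The Python sketch below assumes a side condition with m = 0, in which case the estimates $m_i = n_i / (1 - h a_i)$ automatically add to n once the side condition is satisfied; it locates h by bisection. The data, the bracket for h, and the function names are hypothetical choices for illustration.

```python
def ml_estimates(freqs, coeffs, h):
    """Solve n_i / m_i - 1 + h * a_i = 0 for each m_i at a trial value of h."""
    return [f / (1.0 - h * a) for f, a in zip(freqs, coeffs)]

def side_condition(freqs, coeffs, h):
    """Value of sum(a_i * m_i) at a trial h; the target assumed here is m = 0."""
    return sum(a * m for a, m in zip(coeffs, ml_estimates(freqs, coeffs, h)))

def solve_multiplier(freqs, coeffs, lo=-0.99, hi=0.99, tol=1e-12):
    """Bisection on h; the bracket must keep every 1 - h * a_i positive."""
    f_lo = side_condition(freqs, coeffs, lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = side_condition(freqs, coeffs, mid)
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2.0

# Hypothetical sample: three cells, first and third expected frequencies equal.
freqs = [30, 50, 20]
coeffs = [1, 0, -1]

h = solve_multiplier(freqs, coeffs)
estimates = ml_estimates(freqs, coeffs, h)
print(h, estimates)
```

For these frequencies the multiplier is close to -0.2 and the estimates are close to 25, 50, 25.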

7. Second-order tables with known expected marginal totals. As stated in
section 2, the integration of available techniques is facilitated by regarding
maximum likelihood statistics as the best practical approximations to the
corresponding ideal linear estimates. Since this important principle may not
be immediately obvious, it will be illustrated for the important special case of
second-order tables for which the expected marginal totals are known.

Consider a sample of n observations arranged on two bases of classification
and presented in a table containing r rows and s columns. The universe of N
observations has been completely enumerated and classified on each basis
separately but not cross-classified; i.e., universe totals of first order classes are
known.

For the cell in the ith row and the jth column, let $p_{ij}$ represent the universe
cell proportion; $n_{ij}$, the sample frequency; $n p_{ij}$, the expected value of $n_{ij}$;
and $m_{ij}$, the maximum likelihood estimate of $n p_{ij}$. Indicating summation
by substituting a dot for the letter over which summation is to be performed,
the known marginal totals satisfy the equations

(7.1)    $N p_{i.} - N_{i.} = 0,$
         $N p_{.j} - N_{.j} = 0,$

where $p_{i.}$ and $p_{.j}$ are the universe proportions and $N_{i.}$ and $N_{.j}$ are the known
universe totals in the ith row and the jth column, respectively.

When n observations of a random sample are arranged according to two
bases of classification in a table with r rows and s columns for which the r + s
marginal totals are known, the typical equation of maximum likelihood can
be obtained by maximizing, subject to side conditions based on equations (7.1),
the likelihood function

(7.2)    $L = \sum\sum n_{ij} \log m_{ij} - \sum a_i (m_{i.} - n p_{i.}) - \sum b_j (m_{.j} - n p_{.j}),$

with respect to the maximum likelihood estimates $m_{ij}$, where $a_i$ and $b_j$ are typical
Lagrange multipliers. Setting the partial derivative with respect to $m_{ij}$ equal
to zero and transposing, the typical equation of maximum likelihood can be
written

(7.3)    $n_{ij} / m_{ij} = a_i + b_j.$

Since equations (7.3) are not linear in their unknowns, the reader’s first
reaction might well be to agree with a certain anonymous critic that “their
solution is difficult.” This impression of great difficulty is probably the chief
reason that previous writers have not used the method of maximum likelihood
for this type of problem even after they had developed a set of techniques adequate
for the solution of the equations of maximum likelihood. In other words,
all that was needed was the integration of available techniques as will now
be shown.

In 1940, Deming and Stephan [2] derived a set of normal equations for the
adjustment of a set of second-order cell frequencies to known expected marginal
totals by the method of least squares in which each sample frequency is weighted
by its own reciprocal. This method yields statistics which are efficient according
to the theory of large samples, but they do not satisfy the criterion of maximum
likelihood exactly. In the same article was presented an easier method of
iterative proportions, which, unfortunately, does not yield least squares statistics.
In 1942, Stephan [5] developed an improved iterative process which
yields statistics which satisfy the criterion of least squares with arbitrarily
chosen weights. The foregoing developments are presented in greater detail
in Deming’s book [1] in which Deming adapts Stephan’s iterative method to
the particular case in which each sample frequency is weighted by its own
reciprocal so as to yield solutions for the normal equations derived in the joint
article [2].

In Deming’s notation, equation 8 of Stephan’s article [5, p. 169] can be written 
(7.4) m.j = c„(p» + gr,- - 1) + , 

an expression obtained by substituting $c_{ij}$ for $n p_{ij}$ in the denominators of
chi-square and minimizing with respect to the statistics in the numerators. Hence,
if exact values of the $n p_{ij}$ were used for the $c_{ij}$, the Stephan iterative method
would yield ideal linear estimates. Unless these parameters are implied by
some hypothesis to be tested, it is necessary, in practice, to estimate the $n p_{ij}$
from sample data. In order to secure maximum likelihood estimates of expected
cell frequencies by means of the Stephan iterative method, the adjusted frequencies
based on first approximations to the $c_{ij}$ should be used as second
approximations to the $c_{ij}$, etc. In this way, maximum likelihood statistics can
be derived to any desired degree of approximation. At this point it should
be emphasized that the preceding statement applies not only to the class of
problems considered in this section but also to the wider class of problems for
which the Stephan iterative method provides solutions.
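
The successive-approximation idea can be sketched in Python for a small table. The sketch is an illustration under stated assumptions and is not claimed to reproduce Stephan's iterative method in detail: each weighted least squares adjustment is obtained by a simple alternation between row and column multipliers, the first approximations to the $c_{ij}$ are the sample frequencies (the weighting by reciprocals described above), and the two-by-two frequencies, expected marginal totals, and function names are hypothetical.

```python
def weighted_adjustment(freqs, row_totals, col_totals, c, sweeps=200):
    """Minimize sum (m_ij - n_ij)^2 / c_ij subject to the expected marginal totals.

    The minimizing values have the form m_ij = n_ij + c_ij * (lam_i + mu_j); the
    multipliers are found here by alternating between rows and columns.
    """
    r, s = len(freqs), len(freqs[0])
    lam, mu = [0.0] * r, [0.0] * s
    for _ in range(sweeps):
        for i in range(r):
            gap = row_totals[i] - sum(freqs[i])
            lam[i] = (gap - sum(c[i][j] * mu[j] for j in range(s))) / sum(c[i])
        for j in range(s):
            gap = col_totals[j] - sum(freqs[i][j] for i in range(r))
            col_c = sum(c[i][j] for i in range(r))
            mu[j] = (gap - sum(c[i][j] * lam[i] for i in range(r))) / col_c
    return [[freqs[i][j] + c[i][j] * (lam[i] + mu[j]) for j in range(s)]
            for i in range(r)]

def successive_approximations(freqs, row_totals, col_totals, rounds=20):
    """Reuse each adjusted table as the denominators c_ij of the next adjustment."""
    c = [[float(x) for x in row] for row in freqs]  # first approximation: c_ij = n_ij
    for _ in range(rounds):
        c = weighted_adjustment(freqs, row_totals, col_totals, c)
    return c

# Hypothetical sample of n = 100 with expected marginal totals 55, 45 and 45, 55.
freqs = [[30, 20], [10, 40]]
rows, cols = [55, 45], [45, 55]

first = weighted_adjustment(freqs, rows, cols, [[float(x) for x in row] for row in freqs])
limit = successive_approximations(freqs, rows, cols)
print(first, limit)
```

For this table the first adjustment is approximately 34.8, 20.2, 10.2, 34.8, while the limiting values are 35, 20, 10, 35. The ratios $n_{ij}/m_{ij}$ of the limiting table satisfy the additive form of equation (7.3).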

Unfortunately, theoretical discussions of previous writers contain confusing
compensating errors which (1) present their own methods in an unnecessarily
unfavorable light and (2) increase the difficulties involved in the introduction
of the improvements in techniques suggested in section 9 which involve some
degree of adaptation of techniques already available. For these reasons, it
seems necessary to follow the arguments of previous writers in order to show
the points at which improvements are needed. This can be done most effectively
in connection with Deming’s book [1] where the method of least squares
is presented in great detail.

For the special case in which the sampling errors in the observations are uncorrelated,
the ideal criterion of least squares implies that the weight of each
observation should be inversely proportional to its sampling variance. This
criterion is accepted as well known by Deming who says that “the principle of
least squares requires the minimizing of the sum of the weighted squares of the
residuals” [1, p. 14] where “the weights of two functions are inversely proportional
to their variances” [1, p. 22]. Deming assumes that “there is no
correlation between the errors in the observations” with the qualification that
“this assumption covers a wide class of problems, but does fail to cover some.”
[1, p. 49]. This assumption of uncorrelated errors is not applicable to sample-frequency
problems, of course, because the sample frequencies are correlated
with each other in such a way that the reciprocals of the ideal least squares
weights are not proportional to the sampling variances $n p_{ij} q_{ij}$ but rather to
the expected frequencies $n p_{ij}$ which appear in the denominators of chi-square.
In this connection it is interesting to note that Deming himself insists that
“there is only one principle of least squares, namely, the minimizing of $\chi^2$”
[1, p. 51]. However, the method currently in use for the minimizing of chi-square
was that given by Fisher [3] which leads to equations which are difficult to solve
even for such a simple example as the one presented in section 2 above.

Deming and Stephan are to be commended for seeking an easier method
but there is no justification (even as a device for saving effort) for their modification
of the “principle of least squares” so as to imply erroneously that

(1) weights of correlated sample frequencies are inversely proportional to
their variances, and

(2) sample frequencies are, in general, approximately proportional to their
own sampling variances.

Strangely enough, these two errors were applied in combination by Deming and
Stephan to obtain good practical approximations to the ideal least squares
weights. It might be argued that the second misleading implication is really
not an error because it is offered as a simplifying approximation, but it is an
integral part of both the normal equations approach in the joint article [2]
and Deming’s adaptation [1] of the Stephan iterative method; that is, in each
case the method would have to be revised if better approximations to the ideal
least squares weights were used. More explicitly, Deming (1) uses $n_{ij}$ for Stephan’s
$c_{ij}$ in equation (7.4); (2) identifies it with the other $n_{ij}$ in the same equation;
and (3) reduces the equation to a different form, thus effectively preventing
the use of successive approximations to the $c_{ij}$ without returning to Stephan’s
iterative method in the general form given by equation (7.4) above, which
Deming does not present at all. Results of the joint article [2] are quoted by
Stephan [5] without any explanation of the nature of the errors, but none of
these results are used in the development of his iterative method which, as noted
above, is applicable to any arbitrarily chosen set of weights. The fact that
Stephan corrected the second error without correcting the first implies that the
weights he actually used are unsatisfactory. In Deming’s adaptation of the
Stephan iterative method, a much better set of weights is obtained, not by correcting
the first offsetting error overlooked by Stephan, but by resurrecting the
second offsetting error which Stephan had corrected. Since this error is an
integral part of Deming’s adaptation, Deming’s theoretical discussion implies
that his own efficient statistics are only rough approximations which are definitely
inferior to the inefficient statistics obtained by means of the weights chosen by
Stephan. These inconsistencies are most clearly brought out by Deming when
he says:

“Strictly, in random sampling, the reciprocal of the weight of $n_{ij}$ is $n p_{ij} q_{ij}$, which is
nearly equal to $n_{ij} q_{ij}$, where p and q have their usual connotations. But since factors
proportional to the weights may be substituted for them, it is sufficient to use $n_{ij}$ as the
reciprocal of the weight in cell ij, since the values of $q_{ij}$ do not usually vary much over the
table.” [1, p. 102]

In any given problem the seriousness of the error in the first statement in
the foregoing quotation depends on the variation among the $q_{ij}$’s. In the particular
example used by Deming the error is of considerable importance because
the largest $q_{ij}$ is more than 40 per cent larger than the smallest $q_{ij}$. The weights
actually used by Deming agree with weights implied by the ideal method of least
squares except for sampling errors in the $n_{ij}$; hence, the error in any relative
weight converges stochastically to zero so that Deming’s statistics are efficient
according to the theory of large samples. The efficiency of Deming’s statistics
is inconsistent with the theory presented by Deming which implies erroneously
that efficiency of estimation depends on approximate equality of cell proportions.
If this argument were true it would apply also to the method of maximum
likelihood and all other methods which yield efficient practical statistics in
sample-frequency problems. The foregoing discussion, together with the results
of section 8, shows that the theory as presented by Deming has the following
seriously misleading features:

(1) it is based on a paradox in which a good final result is obtained by means
of compensating errors,

(2) it presents his efficient statistics in an unnecessarily unfavorable light,

(3) it emphasizes the irrelevant condition of approximate equality of universe
cell proportions,

(4) it fails to mention the important condition of proportionality by rows
and columns, and

(5) it makes least squares, minimum chi-square, and maximum likelihood
seem to be competing alternative methods.

Of these undesirable characteristics, the last two are probably the most serious
because they make the effective integration and adaptation of statistical techniques
more difficult. As has been shown in sections 4, 5, and 6, the sample-frequency
form of the ideal method of least squares is the method of minimum
chi-square which always leads (by means of appropriate practical approximations
to unknown weights) to maximum likelihood statistics; in other words,
the methods are equivalent from a practical point of view.

Since the ideal method of least squares based on the unknown $np_{ij}$ determines 
fully efficient, but theoretical, ideal linear estimates, the efficiency of practical 
approximations to ideal linear estimates depends on the accuracy with which 
the denominators of chi-square are estimated. For the unknown denominators 
$np_{ij}$, Deming uses the sample frequencies while the method of maximum 
likelihood implies the use of the corresponding maximum likelihood estimates, 
statistics which, in general, have smaller sampling variances. The foregoing 
argument suggests that maximum likelihood statistics are slightly superior to 
Deming’s statistics for any given finite value of n and that their relative 
advantage increases as the sample size decreases. In large samples both methods 
yield efficient statistics because the relative errors in the weights implied by 
either method converge stochastically to zero as n increases. Although the 
advantage of maximum likelihood statistics over Deming’s statistics is 
unimportant except in small samples, it can be shown that Deming’s choice of weights 
leads to imperfectly compensated negative errors of estimation even in his 
large sample of 33,837 observations. 
Deming weights each sample frequency by its own reciprocal. Positive errors 
of sampling decrease the value of the reciprocal and thus increase the absolute 
size of the required negative adjustments. Negative errors of sampling increase 
the value of the reciprocal and thus decrease the size of the positive adjustment. 
Thus every error of sampling (either positive or negative) leads to a negative 
error of estimation due to inappropriate weighting. Because the sum of all 
adjustments must be zero, these negative errors of estimation are compensated 
on the average but more or less imperfectly. The net effect of this imperfect 
compensation of negative errors of estimation is that Deming’s statistics are 
too small in those cells in which the relative adjustments (either positive or 
negative) are large, and vice versa. In a preliminary draft of this article, 
this type of error of estimation was studied by comparing Deming’s statistics 
with the corresponding maximum likelihood statistics in connection with Deming’s 
example involving 33,837 observations. Although errors of estimation of the 
type under discussion are apparent, they are, of course, extremely small in such 
a large sample. For this reason the large-sample comparison has been deleted 
in favor of simple hypothetical examples designed to throw light on similar errors 
of estimation in statistics derived by Fisher’s method of minimum chi-square 
as well as in those derived by Deming’s adaptation of Stephan’s iterative 
method. 

Consider a set of sample frequencies in a two-by-two table for which all 
expected marginal totals are equal. For this special case, the cell proportions 
on each diagonal are equal and the ideal linear estimate (which is also the 
maximum likelihood estimate) of any cell proportion is the mean of the two 
sample cell proportions on its diagonal. For the same case, Deming’s adaptation 
of the Stephan iterative method yields an estimate for each cell which is 
proportional to the harmonic mean of sample proportions on its diagonal while 
Fisher’s method of minimum chi-square yields estimates proportional to the 
corresponding quadratic means. 

As a numerical example of the foregoing problem consider the set of 
frequencies 

(7.5)    $n_{11}, n_{12}, n_{21}, n_{22} = 1, 4, 3, 2,$ 

obtained in a sample of 10 observations selected at random from a universe 
in which the cell proportions are known to be 

(7.6)    $p_{11}, p_{12}, p_{21}, p_{22} = p,\ 0.5 - p,\ 0.5 - p,\ p.$ 

As estimates of the parameter p, the ideal linear estimate is .15, Deming’s 
adaptation of the Stephan iterative method yields .14, and Fisher’s method of 
minimum chi-square yields .1545 to four decimal places, the other two estimates 
being exact. The results illustrate the imperfectly compensated errors of 
estimation explained previously. The two sample frequencies on the principal 
diagonal ($n_{11}$ and $n_{22}$) have greater relative dispersion than the frequencies on 
the other diagonal. For this reason, the relative adjustments made by Deming’s 
method are greater and, according to the principle of imperfectly compensated 
negative errors of estimation, the estimate of p obtained by Deming’s method 
is smaller than the ideal linear estimate of p. Fisher’s method of minimum 
chi-square yields an estimate of p which is greater than the ideal linear estimate. 
In fact, one should usually expect imperfectly compensated errors of estimation 
in statistics derived by Fisher’s method of minimum chi-square to be opposite in 
sign and about half as large as those in the corresponding statistics derived by 
means of Deming’s adaptation of the Stephan iterative method. 
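
The three estimates quoted for the frequencies (7.5) can be checked with a short computation. The sketch below is only an illustration under the stated special case (all expected marginal totals equal, so that each diagonal shares one proportion and the two diagonal estimates sum to 0.5); the function name `diagonal_estimates` is hypothetical and does not come from the article.

```python
from math import sqrt


def diagonal_estimates(n11, n12, n21, n22):
    """Estimates of p for a 2x2 table with cell proportions p, 0.5-p, 0.5-p, p.

    Returns (arithmetic, harmonic, quadratic): the estimates obtained when each
    diagonal is summarized by the arithmetic, harmonic, or quadratic mean of
    its two sample proportions and the result is rescaled so that the four
    estimated proportions sum to one.
    """
    n = n11 + n12 + n21 + n22
    principal = (n11 / n, n22 / n)   # cells with universe proportion p
    other = (n12 / n, n21 / n)       # cells with universe proportion 0.5 - p

    def arithmetic(a, b):
        return (a + b) / 2

    def harmonic(a, b):
        return 2 * a * b / (a + b)

    def quadratic(a, b):
        return sqrt((a * a + b * b) / 2)

    def rescaled(mean):
        # p + (0.5 - p) = 0.5, so the principal-diagonal share is scaled to 0.5.
        m1, m2 = mean(*principal), mean(*other)
        return 0.5 * m1 / (m1 + m2)

    return rescaled(arithmetic), rescaled(harmonic), rescaled(quadratic)


ideal, deming, fisher = diagonal_estimates(1, 4, 3, 2)
print(round(ideal, 4), round(deming, 4), round(fisher, 4))
```

For the frequencies 1, 4, 3, 2 the harmonic-mean estimate falls below the arithmetic-mean estimate by .01 and the quadratic-mean estimate exceeds it by about .0045, which is the "opposite in sign and about half as large" pattern described in the text.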

At this point, it should be emphasized that Fisher does not recommend his 
own method of minimum chi-square in preference to the method of maximum 
likelihood. In fact, he presents the theory of estimation in such a way as to 
imply correctly that the method of maximum likelihood is superior, especially 
in small samples. Other writers have noted the small differences between 
equations of maximum likelihood and those for minimizing chi-square by Fisher’s 
method and some have even derived one set of equations from the other by 
neglecting higher order terms in a Taylor series expansion. These derivations 
are of no interest here because they seem to justify the method of maximum 
likelihood as a simple approximation to some more complicated method. This 
type of justification is both unnecessary and undesirable. It is more useful to 
regard the method of maximum likelihood as an approximation to a method 
(least squares) for which the theory is simpler. 

Skeptical readers who find the foregoing argument unconvincing may be able 
to profit from the following example. Consider the problem of estimating the 
parameter p where 2p is the proportion of white balls in an urn. A sample of 10 
balls is selected and classified by the following process. Each white ball is 
placed in one of the cells on the principal diagonal of a two-by-two table, the 
particular cell being decided by the toss of a coin. A similar method is used for 
non-white balls placed in cells on the other diagonal. Assuming that the results 
of this process are given by equation (7.5), which of the three alternative 
estimates of p given above should be preferred? Belief in the general superiority 
of Fisher’s method of minimum chi-square seems to imply that the device of 
coin-tossing described in this example can be used in practical problems involving 
the estimation of the proportion of “successes” to secure estimates which are 
superior to the sample proportion, the ideal linear estimate in such cases. 
Even if it is possible to construct trivial special case examples supporting some 
complicated method for such problems, the general use in practical problems of 
the coin-tossing device in connection with either Fisher’s method of minimum 
chi-square or Deming’s adaptation of the Stephan iterative method would be 
absurd, as this example is intended to emphasize. 

8. The method of proportional distribution of marginal adjustments. The 
method of proportional distribution of marginal adjustments is a general method 
of adjusting sample frequencies so that their row and column totals agree with 
known expected marginal totals. In other words, the adjusted frequency for 
the cell in the ith row and the jth column is given by the equation 

(8.1)    $m^*_{ij} = n_{ij} + p_{i.} d_{.j} + p_{.j} d_{i.}$ 

where 

    $d_{i.} = m_{i.} - n_{i.}$, 

and 

    $d_{.j} = m_{.j} - n_{.j}$, 

are the net adjustments in the sample cell frequencies of the ith row and the 
jth column, respectively. The asterisk is used to distinguish the set of statistics 
based on equation (8.1) from the maximum likelihood estimates and the ideal 
linear estimates. 

The method of proportional distribution of marginal adjustments yields ideal 
linear estimates when the universe cell proportions are proportional by rows and 
by columns, i.e., when 

(8.2)    $p_{ij} = p_{i.} p_{.j}.$ 

This important principle can be established by substituting in equation (7.4) 
of section 7 the quantities 

(8.3)    $c_{ij} = n p_{i.} p_{.j},$ 

         $p_i = 0.5 + d_{i.}/np_{i.},$ 

and 

         $q_j = 0.5 + d_{.j}/np_{.j},$ 

and reducing the typical equation of the ideal method of minimum chi-square 
to the form of equation (8.1) which defines the method of proportional 
distribution of marginal adjustments. 
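
A minimal sketch of equation (8.1) follows. It assumes that the known expected marginal totals are supplied as marginal proportions multiplied by the sample size (so that the net adjustments sum to zero); the function name `proportional_adjustment` and the small table used to exercise it are hypothetical, chosen only for illustration.

```python
def proportional_adjustment(n, p_row, p_col):
    """Adjust a two-way table of sample frequencies by equation (8.1).

    n      -- list of rows of sample frequencies n_ij
    p_row  -- known universe row proportions p_i.
    p_col  -- known universe column proportions p_.j
    Expected marginal totals are taken as (sample size) x (proportion),
    which is an assumption of this sketch.
    """
    total = sum(sum(row) for row in n)
    row_totals = [sum(row) for row in n]
    col_totals = [sum(col) for col in zip(*n)]

    # Net adjustments d_i. and d_.j: expected total minus sample total.
    d_row = [total * p - t for p, t in zip(p_row, row_totals)]
    d_col = [total * p - t for p, t in zip(p_col, col_totals)]

    # Each row adjustment is spread over the columns in proportion to p_.j,
    # and each column adjustment over the rows in proportion to p_i.
    return [
        [n[i][j] + p_col[j] * d_row[i] + p_row[i] * d_col[j]
         for j in range(len(p_col))]
        for i in range(len(p_row))
    ]


sample = [[12, 8, 5], [20, 30, 25]]          # hypothetical sample of 100
adjusted = proportional_adjustment(sample, [0.3, 0.7], [0.3, 0.4, 0.3])
print([round(sum(row), 6) for row in adjusted])        # adjusted row totals
print([round(sum(col), 6) for col in zip(*adjusted)])  # adjusted column totals
```

Each cell is computed independently of the others, which is the property the text relies on when it speaks of localizing computational errors.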

Even in the absence of exact proportionality, under which it yields fully 
efficient statistics, the method of proportional distribution of marginal 
adjustments has the following relative advantages over other available methods: 

(1) ease of extension to tables of higher order; 

(2) exact agreement with known (expected) marginal totals; 

(3) simplicity of interpretation; 

(4) independence of computational errors; 

(5) rapidity of processing; 

(6) economy of effort; and 

(7) fully efficient criteria for testing the significance of departures from 
proportionality of rows and columns. 

Ease of extension to tables of higher order is a desirable property of the 
method of proportional distribution of marginal adjustments. Equation (8.1) 
applies to the special case in which there are only two bases of classification. 
In the more general case sample observations are cross-classified according to 
r bases of classification, each cell frequency in an rth order table being the 
number of observations in the corresponding rth order class whose expected value 
is to be estimated. The required adjustment for each first order class (obtained 
by subtracting the sample total from its known expected value) is distributed 
among the various cells in proportion to the universe totals of the corresponding 
(r - 1)th order classes to which the cells belong. The general process is 
illustrated by 

(8.4)    $m^*_{ijk} = n_{ijk} + p_{ij.} d_{..k} + p_{i.k} d_{.j.} + p_{.jk} d_{i..},$ 

the formula for estimating the expected frequency in the general cell of a third 
order table. 

Exact agreement with marginal totals follows easily from the method of 
proportional distribution and can be established algebraically by summing the 
estimation equation by first order classes; e.g., summing equation (8.1) by rows 
and columns. In practice, discrepancies are always either errors of rounding 
or mistakes in computation; they are never due to lack of convergence of iterative 
processes as is often true in alternative methods of estimation. 

Although simplicity of interpretation is desirable in general, it is especially 
important when random sampling is an unrealistic abstraction. For example, 
the method of proportional distribution of marginal adjustments has been used 
to estimate the cell proportions in a two-way classification of incomes from known 
marginal proportions and a detailed cross classification at an earlier date. In 
this problem known shifts in income distributions made it evident that certain 
cells previously vacant should not have the zero proportions which would be 
estimated for them by other available methods of estimation. The ease with 
which the effects of the method of adjustment can be traced is important also 
in the analysis of the results of sample surveys in which various types of bias 
are important. 

The method of proportional distribution of marginal adjustments yields the 
estimated expected frequency for any cell by a single sequence of computations 
which is independent of the corresponding process for any other cell. Errors 
made in computing the estimate for any cell appear in marginal totals of 
estimates for all first order classes which include that cell. If only a few errors are 
made in a table they can be localized immediately and can be corrected without 
recomputing any estimates which are correct. 

In certain types of social surveys, rapidity of processing is so important that, 
as Deming puts it, “the delay of only the brief time required for adjustment 
may not be advisable” [1, p. 102]. Under these conditions, it is important to 
have a simple formula like equation (8.1) in which substitutions can be made 
rapidly. Even when the time element is relatively unimportant, the economy 
of effort and the ease of explaining the method to clerical assistants are often 
of practical importance. 

Finally, departures from proportionality among rows and columns often 
provide the chief element of interest in research studies, not only in social 
surveys of the type illustrated in Deming and Stephan’s example but also in 
biological sciences. The most effective tests of significance for the purpose of 
presenting statistical evidence of lack of proportionality are those based on 
statistics like those derived by the method of proportional distribution of marginal 
adjustments whose efficiency is 100 per cent when proportionality is exact. 

Even when proportionality is not exact, the efficiency of statistics derived 
by proportional distribution may be close to 100 per cent under fairly typical 
problem conditions such as those in the example by Deming and Stephan wherein 
the other more complicated methods require several times as much computational 
effort, but have little advantage over the easier method with respect to effi- 
ciency of estimation in this particular problem. 

9. Suggested improvements in techniques. In section 7, a method was 
outlined by which it is possible to derive sets of maximum likelihood statistics 
by merely integrating available techniques without changing any of them. 
In this section a number of improvements are suggested. At this point it should 
be emphasized that a given change is not an improvement merely because it 
yields slightly more accurate estimates or makes possible a slight saving of 
time and effort. In each case the research worker should consider saving of time 
and effort and accuracy of estimation simultaneously. In particular, it seems 
likely that most social surveys of the type considered by Deming and Stephan 
are characterized by approximate proportionality by rows and by columns, 
conditions relatively favorable to the simple method of proportional 
distribution of marginal adjustments. It should be clearly understood that 
suggestions in this section are intended for those research workers whose problems 
justify a great deal more effort than is required to adjust sample frequencies 
by this simple method. 

Assuming that the problem at hand warrants the effort required to derive 
maximum likelihood estimates, the first consideration is the derivation of a 
set of values of $m_{ij}(1)$, first approximations to the $m_{ij}$, and a set of values 
of $p_i(1)$, first approximations to the $p_i$. Even if proportionality by rows and 
by columns is not closely approximated, use of the values of the $p_i(1)$ provided 
by equation (8.3) is especially to be recommended. In the example used by Deming these 
values for the $p_i(1)$ are so much better than the values recommended by Deming 
that they save a large proportion of the effort required by the iterative process. 
If rows and columns are approximately proportional, equation (8.1) should be 
used to provide values of the $m_{ij}(1)$, in which case it is possible to use an 
iterative process similar to the one used by Deming but based on the typical 
equation of maximum likelihood (7.3) to achieve a given degree of accuracy in the 
maximum likelihood estimates with even less effort. Under favorable conditions 
such as those in Deming’s example the suggested iterative process yields excellent 
approximations to maximum likelihood estimates by means of the following 
steps: 

1. Construct a set of first approximations to the r row components of the rs 
maximum likelihood divisors $(a_i + b_j)$ by means of the equation 

(9.1)    $a_i(1) = n_{i.}/np_{i.} - 1/2.$ 

2. Compute successive approximations to the $a_i$ and $b_j$ by means of the 
equations 

(9.2)    $b_j(g) = [n_{.j} - \sum_i m_{ij}(1) a_i(g)]/np_{.j},$ 

(9.3)    $a_i(g + 1) = [n_{i.} - \sum_j m_{ij}(1) b_j(g)]/np_{i.},$ 

where $m_{ij}(1)$, the first approximation to $m_{ij}$, is derived by means of equation 
(8.1). Just as in Deming’s iterative process, the expression in brackets is a 
series of products which can be subtracted in a single sequence of machine 
operations and the final division can be performed without having to record 
any of the intermediate results. 

3. Divide the sample frequencies by the maximum likelihood divisors to obtain 
the maximum likelihood estimates 

(9.4)    $m_{ij} = n_{ij}/(a_i + b_j),$ 

where limiting values of $a_i$ and $b_j$ are approximated as closely as desired by 
successive approximations in the preceding equations. 
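
The three steps above can be sketched as follows. This is an illustration only: it reads (9.1) to (9.4) with divisors $a_i + b_j$, takes the first approximations $m_{ij}(1)$ from equation (8.1), and assumes that expected marginal totals equal the sample size times known marginal proportions. The function name `approximate_ml_estimates`, the fixed number of cycles, and the small tables are hypothetical.

```python
def approximate_ml_estimates(n, p_row, p_col, cycles=25):
    """Iterate (9.2) and (9.3) from the start (9.1), then apply (9.4).

    Returns (m, a, b, m1): the approximate maximum likelihood estimates,
    the row and column components of the divisors, and the first
    approximations m_ij(1) obtained from equation (8.1).
    """
    r, s = len(p_row), len(p_col)
    total = sum(sum(row) for row in n)
    row_totals = [sum(row) for row in n]
    col_totals = [sum(col) for col in zip(*n)]

    # First approximations m_ij(1) by proportional distribution, eq. (8.1).
    d_row = [total * p_row[i] - row_totals[i] for i in range(r)]
    d_col = [total * p_col[j] - col_totals[j] for j in range(s)]
    m1 = [[n[i][j] + p_col[j] * d_row[i] + p_row[i] * d_col[j]
           for j in range(s)] for i in range(r)]

    # Step 1, eq. (9.1): starting values of the row components.
    a = [row_totals[i] / (total * p_row[i]) - 0.5 for i in range(r)]

    # Step 2, eqs. (9.2) and (9.3): alternate column and row components.
    b = [0.5] * s
    for _ in range(cycles):
        b = [(col_totals[j] - sum(m1[i][j] * a[i] for i in range(r)))
             / (total * p_col[j]) for j in range(s)]
        a = [(row_totals[i] - sum(m1[i][j] * b[j] for j in range(s)))
             / (total * p_row[i]) for i in range(r)]

    # Step 3, eq. (9.4): divide sample frequencies by the divisors.
    m = [[n[i][j] / (a[i] + b[j]) for j in range(s)] for i in range(r)]
    return m, a, b, m1


sample = [[12, 8, 5], [20, 30, 25]]          # hypothetical sample of 100
m, a, b, m1 = approximate_ml_estimates(sample, [0.3, 0.7], [0.3, 0.4, 0.3])
print([[round(x, 3) for x in row] for row in m])
```

When the sample already agrees with the expected marginal totals, every divisor equals one and the sample frequencies are returned unchanged, which gives a convenient check on an implementation.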

Under unfavorable conditions, the iterative process of this section is not 
always the easiest way to obtain satisfactory estimates. For example, when 
samples are small and/or rows and columns are not approximately proportional, 
it is better to use the iterative method as originally presented by Stephan where 
sample frequencies can be used for first approximations to the $c_{ij}$ and these may 
be replaced by successively better approximations. 

The point made in the final paragraph of Fisher’s well-known book [3] that 
“in practice one need seldom do more than solve, at least to a good 
approximation, the equation of maximum likelihood,” is strongly supported by the 
developments of this article. In addition, the proof that the method of least squares 
and the method of minimum chi-square always lead (by means of 
approximations to ideal weights) to maximum likelihood statistics greatly facilitates the 
adaptation of techniques developed in connection with these hitherto competing 
methods. 

A STATISTICAL PROBLEM CONNECTED WITH THE COUNTING OF 
RADIOACTIVE PARTICLES 

By Sten Malmquist 

Institute of Statistics, University of Upsala, Sweden 

1. Introduction. Our problem refers to random events forming a sequence 
in time or in space, e.g. particles emitted by a radioactive matter. By omitting 
certain elements of the given sequence, say f, we form another sequence, say g. 
The rule of omission involves an arbitrarily prescribed constant u. The rule 
to be followed in forming g is: 

Case I: Let a be an element in f and g. The next element to be included 
in g is then the first element in f which follows a after a distance greater than u. 

Case II: Let a be an element in f and g. The next element to be included in 
g is then the first element in f which follows a at a distance greater than u from 
the preceding element in f, whether this belongs to g or not. 

When the events are represented by impulses emitted by a radioactive matter 
and feeding a recorder with a constant resolving time u, the new sequence 
consists of the counted impulses. The two cases correspond to the reaction of 
different types of recorders. The distinction between the two transformations 
has caused some confusion. It has, however, been clearly pointed out by 
Ruark and Brammer [5]. 
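
The two rules of omission can be made concrete with a small sketch. The function names `case_one` and `case_two` are hypothetical, and the sketch assumes that the first element of the given sequence is always counted, which the text does not state explicitly.

```python
def case_one(times, u):
    """Case I: keep an element if it lies more than u after the last KEPT element."""
    kept = []
    for t in times:
        if not kept or t - kept[-1] > u:
            kept.append(t)
    return kept


def case_two(times, u):
    """Case II: keep an element if it lies more than u after the PRECEDING
    element of the original sequence, whether that element was kept or not."""
    kept = []
    previous = None
    for t in times:
        if previous is None or t - previous > u:
            kept.append(t)
        previous = t
    return kept


arrivals = [0.0, 0.1, 0.25, 0.3, 0.6, 0.7, 1.0]   # hypothetical arrival times
print(case_one(arrivals, 0.2))
print(case_two(arrivals, 0.2))
```

With these arrival times and u = 0.2, the element at 0.25 is counted in Case I (it follows the counted element at 0.0 by more than u) but lost in Case II (it follows the uncounted element at 0.1 by less than u).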

v. Bortkiewicz [2] seems to be the first who has considered problems related 
to the transformed sequence. Starting from investigations by Rutherford, 
Geiger, and others, concerning the number of recorded α-particles during a 
certain interval of time, say T, he observed that the distribution of this number 
was similar to that of Poisson but with a slightly smaller dispersion. This fact 
he supposed to be caused by a constant resolving time u of the recorder. By 
means of certain assumptions he tried to calculate the effect on the mean and 
the dispersion by the transformation in Case I, supposing the cumulative 
distribution function F(t) for the distance between two consecutive elements in 
the sequence f is given by 

    $F(t) = 1 - e^{-at},$ 

where here and in what follows, t denotes a non-negative variable. 

Considering Case II with F(t) as above, Levert and Scheen [4] have recently 
worked out an expression for the distribution of the number of elements during 
T in the sequence g. 

Gnedenko [3] has considered the distribution of the number of lost elements 
in Case I with particular regard to the initial state of rest. 

Alaoglu and Smith [1] considered problems referring to successive 
transformations of a sequence. When, for example, a sequence of particles enters 
a tube-counter and amplifier, together acting with a resolving time $u_1$, and 
the impulses then are feeding a recorder with resolving time $u_2 > u_1$, the 
sequence of recorded impulses will be the result of two successive transformations. 
If we have a scaling circuit between the counter and the recorder, we have to 
make a transformation of another type between the two transformations in 
Case I and Case II. 

The present paper deals with the transformed sequence in Case I. The 
distribution function F(t) is supposed to be arbitrary. An advantage of this 
generalization is that the formulas derived could be used in treating problems 
referring to successive transformations. 

The author wishes to express his sincere gratitude to Professor Herman Wold 
for stimulating discussions and valuable advice. 


2. Derivation of distributions for Case I. Suppose that the sequence f 
has F(t) for distribution function for the distance between two consecutive 
elements. F(t) is supposed to be independent of absolute time (space), and of 
the preceding distance between two elements. When not stated otherwise, 
we further suppose F(0) = 0. 

Now let G(t) be the distribution function for the distance between two 
consecutive elements in the transformed sequence g. Evidently G(t) also is 
independent of absolute time and of the preceding distance between two elements. 

We shall consider certain distribution functions connected with F(t). These 
functions will then be used in solving problems concerning the sequence g. 

Let $F_n(t)$ be the distribution function for the distance between the first and 
the last of n + 1 consecutive elements in the sequence f. Then $F_n(t)$ is given 
by the recursive system 

(1)    $F_{n+1}(t) = \int_0^t F(t - x)\, dF_n(x); \qquad F_1(t) = F(t).$ 

As is easily seen, we have 

    $F_{m+n}(t) \le F_m(t) \cdot F_n(t); \qquad (m, n \ge 1)$ 

and, for t = u, 

    $F_n(u) \to 0$, as $n \to \infty$; 

    $\sum_{n=1}^{\infty} F_n(u) < \infty$, provided that $F_1(0) < 1$. 

Alternatively, $F_n(t)$ could be deduced by the use of characteristic functions. 
Still considering the sequence f, let $\Phi(t)$ be the distribution function for the 
distance d between an arbitrarily chosen point and the following element. 
Suppose that the arbitrary point is chosen so that the distance between the 
preceding and the following element is x. Under this condition we have, in usual 
symbols, 

    $P(d > t) = \frac{x - t}{x}.$ 

Hence, 

    $\Phi(t) = 1 - \int_t^{\infty} \frac{x - t}{x}\, dH(x),$ 

where H(t) is the distribution function for the distance x. 

To deduce H(t) we suppose that the distribution F(t) has a finite mean, 

    $m = \int_0^{\infty} t\, dF(t).$ 

By the definition of H(t), we then have 

    $H(x) = \frac{1}{m} \int_0^x t\, dF(t).$ 

Thus 

(2)    $\Phi(t) = \frac{1}{m} \left[ \int_0^t x\, dF(x) + t \int_t^{\infty} dF(x) \right].$ 

The corresponding frequency function $\varphi(t)$ is given by 

(3)    $\varphi(t) = \frac{1 - F(t)}{m}.$ 

Consider n + 2 consecutive elements in f, say $a_0, a_1, \cdots, a_{n+1}$, where $a_0$ 
is an element in the transformed sequence g. The probability $P_n$ that the 
next element in g following $a_0$ will be $a_{n+1}$ is given by 

    $P_n = F_n(u) - F_{n+1}(u), \qquad (n = 1, 2, \cdots),$ 

    $P_0 = 1 - F(u).$ 

Now let $P_n(t)$ be the probability that the distance between $a_0$ and $a_{n+1}$ is 
smaller than or equal to t when $a_0$ and $a_{n+1}$ are two consecutive elements in the 
sequence g. Then 

    $P_n(t) = \frac{1}{F_n(u) - F_{n+1}(u)} \int_0^u [F(t - x) - F(u - x)]\, dF_n(x).$ 

Let G*(t) be defined by 

    $G^*(t) = \sum_{n=0}^{\infty} P_n \cdot P_n(t) = F(t) - F(u) + \sum_{n=1}^{\infty} \int_0^u [F(t - x) - F(u - x)]\, dF_n(x); \qquad t > u.$ 

When G*(t) is a distribution function, then G*(t) equals G(t). 

For $t_1 < t_2$ we obviously have $G^*(t_1) \le G^*(t_2)$. 

For $t = \infty$, 

    $G^*(\infty) = 1 - F(u) + \sum_{n=1}^{\infty} \int_0^u [1 - F(u - x)]\, dF_n(x)$ 

    $\qquad = 1 - F(u) + \sum_{n=1}^{\infty} F_n(u) - \sum_{n=1}^{\infty} F_{n+1}(u) = 1.$ 

Hence we take 

(4)    $G(t) = G^*(t); \qquad t > u,$ 
       $G(t) = 0; \qquad t \le u.$ 

When the corresponding frequency functions g(t) and f(t) exist, we get 

(5)    $g(t) = f(t) + \sum_{n=1}^{\infty} \int_0^u f(t - x) f_n(x)\, dx; \qquad t > u.$ 

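Dealing with a sequence of elements we are often concerned with the number 
of occurrences during a certain time T. 

Let the mean number of occurrences during T be M(T). Supposing that 
the mean $m = \int_0^{\infty} t\, dF(t)$ is finite and that F(0) < 1, we have 

(6)    $M(T) = T/m.$ 

We define 

    $K_1(t) = F(t)$ for $t \ge \epsilon$; $\quad K_1(t) = 0$ for $t < \epsilon$; 

    $K_2(t) = F(t)$ for $t \ge \epsilon$; $\quad K_2(t) = F(\epsilon)$ for $t < \epsilon$; 

and denote the corresponding means by $M_1(T)$ and $M_2(T)$. 

As is easily seen, 

    $M_1(\epsilon) \le M(\epsilon) \le M_2(\epsilon).$ 

Using (2), 

    $M_1(\epsilon) = \frac{\epsilon F(\epsilon) + \epsilon[1 - F(\epsilon)]}{\int_0^{\infty} x\, dK_1(x)} = \frac{\epsilon}{\int_0^{\infty} x\, dK_1(x)},$ 

    $M_2(\epsilon) = \frac{\epsilon[1 - F(\epsilon)]^2 + \cdots + n\epsilon[1 - F(\epsilon)]^2 F(\epsilon)^{n-1} + \cdots}{\int_0^{\infty} x\, dK_2(x)} = \frac{\epsilon}{\int_0^{\infty} x\, dK_2(x)}.$ 

Making $N = T/\epsilon$ and summing, we obtain 

    $M_1(T) = \frac{T}{\int_0^{\infty} x\, dK_1(x)} = \frac{T}{\int_{\epsilon}^{\infty} x\, dF(x) + \epsilon F(\epsilon)},$ 

    $M_2(T) = \frac{T}{\int_0^{\infty} x\, dK_2(x)} = \frac{T}{m - \int_0^{\epsilon} x\, dF(x)}.$ 

By choosing $\epsilon$ arbitrarily small, we get 

    $M(T) = T/m.$ 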

Let P(n, T) be the probability that we get n elements in f during a time T. 
Suppose that the first of these elements, $a_1$, comes at $T_0 + x$, and the last, 
$a_n$, at $T_0 + x + y$. 

We then have 

(7)    $P(n, T) = \int_0^T \varphi(x)\, dx \int_0^{T-x} [1 - F(T - x - y)]\, dF_{n-1}(y).$ 

In (4) and (7) we have equations for the transformation in Case I. Because 
of the general form of F(t), the formulas also can be used when we are concerned 
with successive transformations. It can further be remarked that the 
transformation of a sequence of impulses by passing a scaling circuit is expressed by 
the system (1). 


3. Results for a particular form for F(t). The preceding formulas will 
now be used for a special distribution function F(t). Suppose that the 
frequency function f(t) = dF(t)/dt is equal to the frequency function of the 
distance between an arbitrary point and the following element. 

From (3) we get 

    $F'(t) = \frac{1 - F(t)}{m},$ 

or, when F(0) = 0, 

(8)    $F(t) = 1 - e^{-at};$ 

(9)    $f(t) = ae^{-at}$, where $1/a = m = \int_0^{\infty} t f(t)\, dt.$ 

By means of the theory of characteristic functions we have 

(10)    $f_n(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} [\eta(x)]^n e^{-itx}\, dx; \qquad f_1(t) = f(t),$ 

where 

(11)    $\eta(x) = a \int_0^{\infty} e^{itx - at}\, dt = \frac{a}{a - ix}.$ 

Thus 

(12)    $f_n(t) = \frac{a^n}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-itx}}{(a - ix)^n}\, dx.$ 

For n = 1, we get 

(13)    $e^{-at} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-itx}}{a - ix}\, dx.$ 

By differentiating (13) n - 1 times with respect to a we obtain 

    $(-1)^{n-1} t^{n-1} e^{-at} = \frac{1}{2\pi} (-1)^{n-1} (n - 1)! \int_{-\infty}^{\infty} \frac{e^{-itx}}{(a - ix)^n}\, dx.$ 

Hence, from (12), 

(14)    $f_n(t) = \frac{a^n t^{n-1} e^{-at}}{(n - 1)!}.$ 

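From (5) we obtain the frequency function for the transformed sequence 

(15)    $g(t) = ae^{-at} + \sum_{n=1}^{\infty} \int_0^u ae^{-a(t-x)} \frac{a^n x^{n-1} e^{-ax}}{(n - 1)!}\, dx = ae^{au} e^{-at}; \qquad t > u,$ 
        $G(t) = 0, \qquad t < u.$ 

The mean $m_g$ is given by 

    $m_g = a \int_u^{\infty} t e^{au} e^{-at}\, dt = \frac{1}{a} + u.$ 

Remark: Suppose the constant u is allowed to vary independently of t and 
that the frequency function of u is y(u); we obtain 

(16)    $m_g = \int_0^{\infty} t\, dt \int_0^{\infty} g(u, t) y(u)\, du = \int_0^{\infty} \frac{1}{a} y(u)\, du + \int_0^{\infty} u y(u)\, du = \frac{1}{a} + m(u).$ 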


Now let the sequence of elements, g, by means of (5) be transformed into a 
new sequence, h. When we are concerned with the counting of particles, 
emitted from a radioactive matter, let the sequence g consist of impulses from 
a counter-amplifier with resolving time u, feeding a recorder with resolving 
time $u_1$. Then the elements in h are the counted impulses, it being supposed 
that the tube-counter and the recorder react according to the assumptions. 
We suppose $u_1 > u$. When $u_1 \le u$, the sequences g and h are identical. 
Let $g_n(t)$ denote the frequency function of the distance between the first and 
the last of n + 1 consecutive elements in g. We find, in the same way as 
used in obtaining (14), 

(17)    $g_n(t) = \frac{a^n}{(n - 1)!} e^{-a(t - nu)} (t - nu)^{n-1}; \qquad t > nu.$ 

Let h(t) be the frequency function for the distance between two consecutive 
elements in the sequence h. Let further N be the greatest integer smaller 
than or equal to $u_1/u$. 

Using (4) and (5) we obtain 

        $h_{I}(t) = ae^{au} e^{-at} \sum_{n=0}^{N} \frac{a^n}{n!} (u_1 - nu)^n e^{anu}; \qquad t \ge u_1 + u,$ 

(18)    $h_{II}(t) = ae^{au} e^{-at} \sum_{n=0}^{N} \frac{a^n}{n!} [t - (n + 1)u]^n e^{anu}; \qquad (N + 1)u < t < u_1 + u,$ 

        $h_{III}(t) = ae^{au} e^{-at} \sum_{n=0}^{N-1} \frac{a^n}{n!} [t - (n + 1)u]^n e^{anu}; \qquad u_1 < t < (N + 1)u.$ 

The mean $m_h$ is found to be 
(19) JR;. 

We also have 

[ thi(() dt < Ml, < 


= “i + Fi + £ £ ~ 7^”^" 

^ ^ J L Vss n V 1 

f dt 


or 


F- + Wi + wl F £ -, (ri - nuT ) 

Ln J L 0 n' 


< m;. < ~ + Ml e““ F£ ^ (Ui - MM)'‘e““^“^“"“’l 
J L 0 j 


We now consider the number of occurrences during a time interval T. Using 
(6), (16), and (19) we immediately get the mean numbers of occurrences during T. 
By (3), we get for the sequence g 

(20)    $\varphi_g(t) = \frac{a}{au + 1}; \qquad t < u,$ 

        $\varphi_g(t) = \frac{ae^{au} e^{-at}}{au + 1}; \qquad t > u.$ 

Inserting (20), (15) and (14) in (7) and evaluating the integrals, we finally get 

(21)    $P_g(n, T) = a_{n-1} - 2a_n + a_{n+1}; \qquad n < \frac{T}{u} - 1,$ 

        $P_g(n, T) = a_{n-1} - 2a_n + (n + 1) - \frac{aT}{au + 1}; \qquad \frac{T}{u} - 1 \le n < \frac{T}{u},$ 

        $P_g(n, T) = a_{n-1} - 2\left[n - \frac{aT}{au + 1}\right] + (n + 1) - \frac{aT}{au + 1}; \qquad \frac{T}{u} \le n < \frac{T}{u} + 1,$ 

where 

(22)    $a_n = \frac{e^{-a(T - nu)}}{au + 1} \sum_{v=0}^{n} \frac{[a(T - nu)]^v}{v!} (n - v), \qquad (n = 0, 1, \cdots);$ 
        $a_{-1} = 0.$ 

When u = 0, we obtain 

    $a_n = e^{-aT} \sum_{v=0}^{n} \frac{(aT)^v}{v!} (n - v).$ 

For the sequence f we then get the Poisson distribution 

(23)    $P_f(n, T) = \frac{e^{-aT} (aT)^n}{n!}.$ 

The corresponding expression for the sequence h is much more complicated. 
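
The distribution (21) can be evaluated numerically from the coefficients (22). The sketch below is an illustration only: it reads (22) as $a_n = e^{-a(T-nu)}/(au+1) \cdot \sum_{v=0}^{n} [a(T-nu)]^v (n-v)/v!$ and folds the three cases of (21) into one second difference by replacing $a_n$ with $n - aT/(au+1)$ whenever $T - nu \le 0$. The function names are hypothetical, and the parameter values (1/a = log10 e, u = 0.2, T = 1.5, 239 intervals) are those of the experiment in section 4.

```python
from math import exp, factorial, log


def coefficient(n, a, u, T):
    """Coefficient a_n of (22), extended so that one second difference
    reproduces all three cases of (21)."""
    if n < 0:
        return 0.0                       # a_{-1} = 0
    lam = a * (T - n * u)
    if lam <= 0:                         # beyond T/u: the linear form n - aT/(au+1)
        return n - a * T / (a * u + 1)
    partial = sum(lam ** v / factorial(v) * (n - v) for v in range(n + 1))
    return exp(-lam) * partial / (a * u + 1)


def counted_distribution(a, u, T):
    """Probabilities P_g(n, T) for n = 0, 1, ..., covering n < T/u + 1."""
    top = int(T / u) + 1
    return [coefficient(n - 1, a, u, T) - 2 * coefficient(n, a, u, T)
            + coefficient(n + 1, a, u, T) for n in range(top + 1)]


a = log(10)                              # 1/a = log10(e) = 0.4343
probabilities = counted_distribution(a, u=0.2, T=1.5)
expected = [239 * p for p in probabilities]
print([round(x, 1) for x in expected[:3]])
```

Under these assumptions the probabilities sum to one and their mean equals T/(1/a + u), in agreement with (6) applied to the sequence g.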


4. A statistical experiment. The following statistical experiment will serve 
as an illustration of the scheme dealt with in this paper, the transformation of 
a sequence and the resulting formulas, especially (21). 

Groups of five figures, the last rounded up if necessary, have been extracted 
from tables of random sampling numbers (6). Let each group denote the first 
five digits for a decimal x, arbitrarily chosen between 0 and 1. The variable 
x is supposed to have the distribution function t for $0 \le t \le 1$. We now define 
a new variable, y, given by 

(24)    $y = -k \log (1 - x), \qquad [\text{or } y = -k \log x].$ 

The variable y has the distribution function given by (8), viz. 

    $F(t) = 1 - e^{-at}$, where $\frac{1}{a} = m = k \log e.$ 

Transforming each group, or number x, according to (24), we get a sample of 
consecutive distances between elements in the sequence f considered in the 
previous sections. Choosing a constant u, we can construct the corresponding 
sequence g. Beginning with a point, arbitrarily chosen on the first distance, 
we can finally count the number of elements in successive intervals of the same 
length. 

Take k = 1, u = 0.2 and T = 1.5. We then have for the sequences f and g: 

    $m_f = \frac{1}{a} = \log e = 0.4343; \qquad m_g = \frac{1}{a} + u = 0.6343;$ 

    $\sigma_f = \frac{1}{a} = 0.4343; \qquad \sigma_g = \frac{1}{a} = 0.4343;$ 

    $M_f(T) = \frac{T}{m_f} = 3.454; \qquad M_g(T) = \frac{T}{m_g} = 2.365.$ 
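
The experiment can be imitated with pseudo-random numbers in place of printed tables of random sampling numbers. The sketch below is only an illustration: it takes the logarithm in (24) to base 10 (so that m = k log e = 0.4343 for k = 1), treats the starting point as a counted element, and applies the Case I rule with u = 0.2. The function name `simulate_means`, the seed, and the sample size are hypothetical choices.

```python
import math
import random


def simulate_means(k=1.0, u=0.2, count=100_000, seed=0):
    """Return the mean distance between consecutive elements in f and in g.

    Distances in f are generated by (24), y = -k log10(1 - x), with x uniform
    on [0, 1); g is formed by the Case I rule with resolving time u.
    """
    rng = random.Random(seed)
    time = 0.0            # position of the current element of f
    last_counted = 0.0    # position of the last element admitted to g
    g_distances = []

    for _ in range(count):
        time += -k * math.log10(1.0 - rng.random())
        if time - last_counted > u:          # Case I: compare with last COUNTED element
            g_distances.append(time - last_counted)
            last_counted = time

    mean_f = time / count
    mean_g = sum(g_distances) / len(g_distances)
    return mean_f, mean_g


mean_f, mean_g = simulate_means()
print(round(mean_f, 3), round(mean_g, 3))
```

The two means should fall close to the theoretical values 0.4343 and 0.6343, the second exceeding the first by approximately u.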

The experiment yielded the following results. 

For the sequence f:                 For the sequence g: 

Number of elements 801.             Number of elements 555. 

$\bar{m}_f = 0.450.$                    $\bar{m}_g = 0.648.$ 

In neither case is the deviation between the observed and theoretical means 
statistically significant. In fact, the standardized deviation 
$(\bar{m}_f - m_f)\sqrt{800}/\sigma_f$ and the corresponding quantity for the sequence g 
give P = 0.3 and P = 0.4, respectively. 


                                TABLE I 

                 Nos. of intervals with n elements 

               Sequence f                      Sequence g 
         ----------------------   ------------------------------------- 
   n     Observed    Expected     Observed    Expected      Expected 
                     according                according     according 
                     to (23)                  to (21)       to (23) 

   0          6         7.6            5         8.2          23.7 
   1         33        26.1           53        42.5          54.8 
   2         48        45.1           82        81.8          63.3 
   3         55        51.9           69        72.2          48.8 
   4         36        44.8           23        29.2          28.1 
   5         32        31.0            6         4.8          13.0 
   6         17        17.8            1         0.2           5.0 
   7-        12        14.7                                    2.4 

   Σ        239       239            239       238.9         239 

  Mean     3.331      3.454         2.310       2.36          2.31 

   χ²                 4.825                     4.624        36.7 

   P                  0.68                      0.34        <0.001 

(Classes pooled for the χ² test: n = 4, 5, 6 under (21); n = 6, 7 under (23).) 


The functions $a_n$ in (22) can be calculated by means of Pearson’s tables of 
the incomplete Γ-function (7). In the notation of these tables we obtain 

Hence 


n 


n X" TO — X 
au + 1 nl ~ au -F 1 


[1 - I{p, ff)], 
where 


l = a(T-nu), P = :^7^; 


q = n - 2. 


In the present case, however, we need only the first few of the numbers $a_n$.
Accordingly the $a_n$ have been calculated directly.

The resulting theoretical and observed distributions for the number of
elements during $T$ for the sequences $f$ and $g$ will be found in Table I. For
comparison, a Poisson distribution, with the same mean as observed for the
sequence $g$, is given. The result of a χ² test is also shown in Table I. Judged
by the χ² test the distributions (23) and (21) agree fairly well with the
observed distributions. As was to be expected, the Poisson distribution cannot
be used for the sequence $g$.
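
The Poisson comparison columns of Table I can be checked with a few lines of code. The sketch below assumes that the columns headed "(23)" are plain Poisson class frequencies for 239 intervals, with mean 3.454 for the sequence f and mean 2.31 for the sequence g; the function name `poisson_expected` and the pooling of all counts above 6 into one final class are choices made only for this illustration.

```python
import math


def poisson_expected(mean, total, max_n):
    """Expected numbers of intervals with n = 0..max_n elements under a
    Poisson law, followed by one pooled class for n > max_n."""
    probs = [math.exp(-mean) * mean ** n / math.factorial(n)
             for n in range(max_n + 1)]
    tail = 1.0 - sum(probs)  # probability of more than max_n elements
    return [total * p for p in probs] + [total * tail]


# Assumed inputs: 239 intervals, means taken from Table I.
expected_f = poisson_expected(3.454, 239, 6)
expected_g = poisson_expected(2.31, 239, 6)
```

Rounded to one decimal, the leading entries of `expected_f` and `expected_g` agree with the tabulated values 7.6, 26.1, 45.1 and 23.7, 54.8, 63.3.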
THE PROBABILITY FUNCTION OF THE PRODUCT OF TWO NORMALLY
DISTRIBUTED VARIABLES¹

By Leo A. Aroian
Hunter College

1. Introduction and summary. Let $x$ and $y$ follow a normal bivariate
probability function with means $X$, $Y$, standard deviations $\sigma_1$, $\sigma_2$,
respectively, $r$ the coefficient of correlation, and $\rho_1 = X/\sigma_1$,
$\rho_2 = Y/\sigma_2$. Professor C. C. Craig [1] has found the probability function
of $z = xy/\sigma_1\sigma_2$ in closed form as the difference of two integrals. For
purposes of numerical computation he has expanded this result in an infinite
series involving powers of $z$, $\rho_1$, $\rho_2$, and Bessel functions of a certain
type; in addition, he has determined the moments, seminvariants, and the moment
generating function of $z$. However, for $\rho_1$ and $\rho_2$ large, as Craig points
out, the series expansion converges very slowly. Even for $\rho_1$ and $\rho_2$ as
small as 2, the expansion is unwieldy. We shall show that as $\rho_1$ and
$\rho_2 \to \infty$, the probability function of $z$ approaches a normal curve, and in
case $r = 0$ the Type III function and the Gram-Charlier Type A series are
excellent approximations to the $z$ distribution in the proper region. Numerical
integration provides a substitute for the infinite series wherever the exact
values of the probability function of $z$ are needed. Some extensions of the main
theorem are given in section 5 and a practical problem involving the probability
function of $z$ is solved.


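2. Theorems on approach to normality. The moment generating function
of $z$ is [1]

(2.1)  $M_z(\theta) = \dfrac{\exp\left\{\dfrac{(\rho_1^2 + \rho_2^2 - 2r\rho_1\rho_2)\theta^2 + 2\rho_1\rho_2\theta}{2[1 - (1+r)\theta][1 + (1-r)\theta]}\right\}}{\sqrt{[1 - (1+r)\theta][1 + (1-r)\theta]}}.$

Let $\bar z$ and $\sigma_z$ be the mean and the standard deviation of $z$, and
$t_z = (z - \bar z)/\sigma_z$. Now

(2.2)  $\bar z = \rho_1\rho_2 + r, \qquad \sigma_z = \sqrt{\rho_1^2 + \rho_2^2 + 2r\rho_1\rho_2 + 1 + r^2}.$

Using (2.2) we find in the usual way the moment generating function of $t_z$,

(2.3)  $M_{t_z}(\theta) = \dfrac{\exp\left\{\dfrac{-2rw + (\rho_1^2 + \rho_2^2 + 2r\rho_1\rho_2)w^2 + 4r^2w^2 - 2w^3(r^2 - 1)(\rho_1\rho_2 + r)}{2[1 - (1+r)w][1 + (1-r)w]}\right\}}{\sqrt{[1 - (1+r)w][1 + (1-r)w]}},$

where $w = \theta/\sigma_z$.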


1 Presented to the American Mathematical Society, Oct. 28, 1944, New York City. 
Consider $r \ge 0$. Then in the limit as $\rho_1$ and $\rho_2 \to \infty$ in any manner whatever,

(2.4)  $\lim M_{t_z}(\theta) = e^{\theta^2/2},$

and by the theorem of Curtiss [2] on moment generating functions we see in
the limit as $\rho_1, \rho_2 \to \infty$ the probability function of $z$ approaches a normal
curve with mean $\bar z$ and variance $\sigma_z^2$, $r \ge 0$.

In case $-1 + \epsilon < r < 0$, $\epsilon > 0$, some care is required wherever

$\sqrt{\rho_1^2 + \rho_2^2 + 2r\rho_1\rho_2}$

occurs. If one uses $\rho_1^2 + \rho_2^2 \ge 2\rho_1\rho_2$, the proof goes forward quite readily.
Hence we have proved the theorem:

Theorem (2.5). The distribution of $z$ approaches normality with mean $\bar z$
and variance $\sigma_z^2$ as $\rho_1$ and $\rho_2 \to \infty$ in any manner whatever,
$-1 + \epsilon < r \le 1$, $\epsilon > 0$.

It is evident in Theorem (2.5) we may allow $\rho_1, \rho_2 \to -\infty$ without any other
changes. Theorems (2.6) and (2.7) are proved in essentially the same way
as (2.5).

Theorem (2.6). The distribution of $z$ approaches normality with mean $\bar z$
and variance $\sigma_z^2$, if $\rho_1 \to \infty$, $\rho_2 \to -\infty$, $-1 \le r < 1 - \epsilon$, $\epsilon > 0$.

Theorem (2.7). The distribution of $z$ approaches normality if $\rho_1$ remains
constant, $\rho_2 \to \infty$, $-1 + \epsilon < r \le 1$, $\epsilon > 0$; or $\rho_1$ remains constant,
$\rho_2 \to -\infty$, $-1 \le r < 1 - \epsilon$, $\epsilon > 0$.

Naturally in any of the theorems $\rho_1$ and $\rho_2$ may be interchanged. In practice
$\rho_1$ and $\rho_2$ are usually positive. The approach to normality is more rapid if
both $\rho_1$ and $\rho_2$ have the same sign as $r$.


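3. Numerical values. In order to show how closely the Type III and the
Gram-Charlier Type A series approximate the probability function of $z$, $f(z)$,
or more precisely $f(z, \rho_1, \rho_2, r)$, we use numerical integration where

(3.1)  $f(z, \rho_1, \rho_2, r) = I_1(z) - I_2(z),$

       $I_1(z) = \dfrac{1}{2\pi\sqrt{1 - r^2}} \displaystyle\int_0^\infty \exp\left\{-\dfrac{(x - \rho_1)^2 - 2r(x - \rho_1)(z/x - \rho_2) + (z/x - \rho_2)^2}{2(1 - r^2)}\right\} \dfrac{dx}{x},$

and $I_2(z)$ is the integral of the same function over $(-\infty, 0)$ [1]. Now $I_1(z)$
may be written as

(3.2)  $I_1(z) = \dfrac{1}{\sqrt{1 - r^2}} \displaystyle\int_0^\infty \varphi(t_1)\varphi(t_2)\beta(t_3)\,\dfrac{dx}{x}, \qquad \varphi(t) = \dfrac{e^{-t^2/2}}{\sqrt{2\pi}}, \qquad \beta(t) = e^{t}, \qquad t_3 = r t_1 t_2,$

where

$t_1 = \dfrac{x - \rho_1}{\sqrt{1 - r^2}}, \qquad t_2 = \dfrac{z/x - \rho_2}{\sqrt{1 - r^2}}.$

We readily obtain $I_1(z)\sqrt{1 - r^2}$ by forming the product of $\varphi(t_1)$, $\varphi(t_2)$,
$\beta(t_3)$ and $1/x$, using numerical integration applying Weddle's formula, the
Gregory-Newton formula, or the simple rectangular formula depending on
circumstances. The rectangular formula [3] is remarkably accurate when the
function $\varphi(t_1)\varphi(t_2)\beta(t_3)/x$ in the interval 0 to $\infty$ or 0 to $-\infty$ is somewhat
symmetrical. Appropriate tables for $\varphi(t_1)$, $\varphi(t_2)$ (see [4]), $\beta(t_3)$ (see [5])
and $1/x$ (see [6]) are readily available. In the important case of the
independence of $x$ and $y$, $r = 0$ and (3.2) becomes

(3.3)  $I_1(z) = \displaystyle\int_0^\infty \varphi(t_1)\varphi(t_2)\,\dfrac{dx}{x}, \qquad t_1 = x - \rho_1, \qquad t_2 = \dfrac{z}{x} - \rho_2.$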
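
The numerical integration described in this section can be sketched as follows for the independent case $r = 0$, where $f(z) = I_1(z) - I_2(z)$ is the integral of $\varphi(x - \rho_1)\varphi(z/x - \rho_2)/|x|$ over all $x$. This is a minimal illustration and not the author's computing scheme: it uses a plain midpoint rule rather than Weddle's or the Gregory-Newton formula, and the name `product_density`, the half-width of 12 standard deviations, and the step count are assumptions made for the example.

```python
import math


def phi(t):
    """Standard normal ordinate."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def product_density(z, rho1, rho2, half_width=12.0, steps=40_000):
    """Midpoint-rule approximation to f(z) when r = 0.

    The range of x is centred on rho1 because phi(x - rho1) is negligible
    further out; a midpoint that falls exactly on x = 0 is skipped.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = rho1 - half_width + (i + 0.5) * h
        if x == 0.0:
            continue
        total += phi(x - rho1) * phi(z / x - rho2) / abs(x)
    return total * h


# Density in t_z units at z = 10 for rho1 = 0, rho2 = 10 (sigma_z = sqrt(101)).
value_in_t_units = math.sqrt(101.0) * product_density(10.0, 0.0, 10.0)
```

The quantity `value_in_t_units` corresponds to $t_z = .9950372$ and may be compared with the first row of Table I of this paper.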


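4. Approximations to $f(z)$. When $r = 0$, the standard seminvariants $\gamma_3$
and $\gamma_4$ of $z$ are

(4.1)  $\gamma_3 = \dfrac{6\rho_1\rho_2}{(\rho_1^2 + \rho_2^2 + 1)^{3/2}}, \qquad \gamma_4 = \dfrac{6[2(\rho_1^2 + \rho_2^2) + 1]}{(\rho_1^2 + \rho_2^2 + 1)^2},$

remembering

$\bar z = \rho_1\rho_2, \qquad \sigma_z = \sqrt{\rho_1^2 + \rho_2^2 + 1}.$

In the Pearson system (see [7]) $\delta$, the criterion, is

(4.2)  $\delta = \dfrac{2\gamma_4 - 3\gamma_3^2}{6 + \gamma_4},$

and for the probability function of $z$

(4.3)  $\delta = \dfrac{2(\rho_1^2 + \rho_2^2 + 1)\{2(\rho_1^2 + \rho_2^2) + 1\} - 18\rho_1^2\rho_2^2}{(\rho_1^2 + \rho_2^2 + 1)[(\rho_1^2 + \rho_2^2 + 1)^2 + 2(\rho_1^2 + \rho_2^2) + 1]},$

and if $\rho_1 = \rho_2 = \rho$

(4.4)  $\delta = \dfrac{2(4\rho^2 + 1)(2\rho^2 + 1) - 18\rho^4}{(2\rho^2 + 1)[(2\rho^2 + 1)^2 + (4\rho^2 + 1)]}.$

Now $\delta = 0$, $\gamma_3 \ne 0$, for the Type III function, and clearly $\lim_{\rho \to \infty} \delta = 0$.

By use of (3.3) the accurate values of $f(z)$ have been calculated for various
combinations of $\rho_1$ and $\rho_2$ and compared with the Type III approximation
using $\bar z$, $\sigma_z$, $\gamma_3$.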
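
Formulas (4.1) to (4.3) are easy to evaluate mechanically, which gives a check on the entries of Table II. The following is a small sketch for the case $r = 0$; the function name `product_seminvariants` and the order of the returned values (chosen to match the order of the values within a cell of Table II) are assumptions of this illustration.

```python
import math


def product_seminvariants(rho1, rho2):
    """Return (mean, sigma_z, gamma3, gamma4, delta) of z = xy for r = 0."""
    s = rho1 ** 2 + rho2 ** 2 + 1.0          # sigma_z squared
    a = 2.0 * (rho1 ** 2 + rho2 ** 2) + 1.0  # bracket in the numerator of gamma4
    mean = rho1 * rho2
    sigma = math.sqrt(s)
    gamma3 = 6.0 * rho1 * rho2 / s ** 1.5
    gamma4 = 6.0 * a / s ** 2
    # Pearson criterion (4.2); algebraically the same as (4.3).
    delta = (2.0 * gamma4 - 3.0 * gamma3 ** 2) / (6.0 + gamma4)
    return mean, sigma, gamma3, gamma4, delta


cell_4_2 = product_seminvariants(4.0, 2.0)
cell_10_10 = product_seminvariants(10.0, 10.0)
```

For $\rho_1 = 4$, $\rho_2 = 2$ this gives $\sigma_z = 4.582576$, $\gamma_3 = .498784$, $\gamma_4 = .557823$ and $\delta$ close to $.056$.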

(4.5) Investigations so far completed show that for $\rho_1 \ge 4$ and $\rho_2 \ge 4$
simultaneously, and $|\delta| \le .008$, the Type III approximation will provide values
of $t_z$ correct to three significant figures at least where the probability
level $\alpha$ satisfies

(4.6)  $.05 \ge \alpha \ge .005.$

These are the values of $t_z$ which would be needed in testing hypotheses. The
exact values of $t_z$ for various values of $\rho_1$ and $\rho_2$ less than 4 will be
determined, it is hoped, in the future and will be published along with the
comparisons of the Type III values of $t_z$ with the accurate values of $t_z$ in the
important borderline cases of $\rho_1 = \rho_2 = 2$, and $\rho_1 = \rho_2 = 3$. The values of $f(z)$
for $\rho_1 = \rho_2 = 2$ and $\rho_1 = \rho_2 = 4$ have been calculated but these are being
withheld for a more complete table. The table of values of $\bar z$, $\sigma_z$, $\gamma_3$, $\gamma_4$,
and $\delta$ (Table II) shows then that the Type III function is excellent along a
band about $\rho_1 = \rho_2$, since $\gamma_3 \ne 0$, and $\delta$ is very small.

We use the Gram-Charlier Type A series of three terms to approximate the
probability function of $z$ in $t_z$ units,

(4.7)  $f(t) \sim \varphi(t) - \dfrac{\gamma_3}{3!}\varphi^{(3)}(t) + \dfrac{\gamma_4}{4!}\varphi^{(4)}(t),$

in the usual notation.
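
As an illustration of the three-term series (4.7), the sketch below evaluates it with the Hermite polynomials $He_3(t) = t^3 - 3t$ and $He_4(t) = t^4 - 6t^2 + 3$, using $\varphi^{(3)}(t) = -He_3(t)\varphi(t)$ and $\varphi^{(4)}(t) = He_4(t)\varphi(t)$. The function names are hypothetical; the example values $\rho_1 = 0$, $\rho_2 = 10$ are those of the special case tabulated in Table I, with $\gamma_3 = 0$ and $\gamma_4 = 6 \cdot 201/101^2$ taken from (4.1).

```python
import math


def normal_density(t):
    """Ordinate of the standard normal curve."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def gram_charlier_a(t, gamma3, gamma4):
    """Three-term Gram-Charlier Type A approximation in standard units."""
    he3 = t ** 3 - 3.0 * t
    he4 = t ** 4 - 6.0 * t ** 2 + 3.0
    return normal_density(t) * (1.0 + gamma3 / 6.0 * he3 + gamma4 / 24.0 * he4)


# Special case rho1 = 0, rho2 = 10: sigma_z = sqrt(101), gamma3 = 0.
sigma_z = math.sqrt(101.0)
gamma4 = 6.0 * 201.0 / 101.0 ** 2
t_first = 10.0 / sigma_z   # z = 10
t_fifth = 30.0 / sigma_z   # z = 30
gc_first = gram_charlier_a(t_first, 0.0, gamma4)
gc_fifth = gram_charlier_a(t_fifth, 0.0, gamma4)
```

The two values reproduce the Gram-Charlier entries .2408235 and .0052944 of Table I to the accuracy printed there.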


TABLE I

   $t_z$        $f(t_z)$ Correct value    Normal Curve    Gram-Charlier Type A

  .9950372         .2400307               .2431710          .2408235
 1.4925558         .1275209               .130970           .127484
 1.9900744         .0538243               .0550708          .053704
 2.4875930         .0184000               .0180791          .0184500
 2.9851116         .0052477               .0046338          .0052944
 3.4826302         .0012009               .0009272          .0012804
 3.9801488         .0002011               .0001449          .000260
 4.4776674         .0000467               .0000177          .0000425
 4.9751860         .00000745              .00000168         .00000555


(4.8) For $|\gamma_3| < .5$ and $\gamma_4 < 4$ simultaneously the Gram-Charlier Type A
series is quite adequate for finding probability levels such as those of (4.6).
These will in general give 3 significant figures for $t_z$. In the special case
$\rho_1 = 0$, $\rho_2 = 10$, the Gram-Charlier Type A series differs from $f(t_z)$ very
slightly in the range $1 \le |t_z| < \infty$ (see Table I). Naturally the Gram-Charlier
will be used wherever Type III is not indicated, although there exist some
overlapping regions where either one may be used. It should be noticed that the
approach of $f(z)$ to normality is more rapid along a row than down a diagonal.
In case either $\rho_1$ or $\rho_2$ is negative, we may make use of the equation

(4.9)  $f(z, -\rho_1, \rho_2, r) = f(-z, \rho_1, \rho_2, -r).$

We note that when $r = 0$, $f(z, \rho_1, \rho_2)$ always possesses a discontinuity at
$z = 0$ (see [1]). A table of $\bar z$, $\sigma_z$, $\gamma_3$, $\gamma_4$, and $\delta$ is provided for values
of $\rho_1$ and $\rho_2$ from 0 to 10 inclusive.
TABLE II*

 $\rho_2$ \ $\rho_1$        2           4           6           8          10

  0   $\bar z$         0           0           0           0           0
      $\sigma_z$      2.236068    4.123106    6.082762    8.062258   10.049876
      $\gamma_3$       0           0           0           0           0
      $\gamma_4$      2.160        .685121     .319942     .183195     .118224
      $\delta$         .529        .205        .101        .059        .039

  2   $\bar z$         4           8          12.         16.         20.
      $\sigma_z$      3           4.582576    6.403124    8.306624   10.246951
      $\gamma_3$       .888889     .498784     .274256     .167493     .111531
      $\gamma_4$      1.259259     .557823     .289114     .172653     .113742
      $\delta$         .020        .056        .056        .042        .031

  4   $\bar z$                    16.         24          32.         40
      $\sigma_z$                  5.744563    7.280110    9.         10.816654
      $\gamma_3$                   .506408     .373206     .263374     .189641
      $\gamma_4$                   .358127     .224279     .147234     .102126
      $\delta$                    -.0084       .0049       .014        .016

  6   $\bar z$                                36.         48.         60
      $\sigma_z$                              8.544004   10.049876   11.704700
      $\gamma_3$                               .346314     .283733     .224503
      $\gamma_4$                               .163258     .118224     .087272
      $\delta$                                -.0054      -.00083      .0038

  8   $\bar z$                                            64.         80
      $\sigma_z$                                         11.357817   12.845233
      $\gamma_3$                                           .262088     .226472
      $\gamma_4$                                           .092663     .072507
      $\delta$                                            -.0034      -.0015

 10   $\bar z$                                                       100.
      $\sigma_z$                                                     14.177447
      $\gamma_3$                                                       .210551
      $\gamma_4$                                                       .059553
      $\delta$                                                        -.0023

* The first value in a cell is $\bar z$, the second $\sigma_z$, the third $\gamma_3$, the fourth $\gamma_4$, the
fifth $\delta$.
5. Some extensions. We may generalize our results to any case where $x$
and $y$ are distributed approximately in a normal distribution, such as the
distribution of the product of two means, when the sizes of the samples $N_1$ and
$N_2$ are large and consequently $\rho_1$ and $\rho_2$ will be large. Another example occurs
if $x$ and $y$ each follows a Bernoulli probability function with parameters $p_1$ and
$p_2$ respectively where the number of trials in each case is large. We must warn
the reader that the condition $\rho_1 \to \infty$, $\rho_2 \to \infty$ alone does not mean that the
distribution of $z$ approaches normality. Both $x$ and $y$ must be distributed normally.

The actual problem which gave rise to this investigation was the question
of determining the sum of a great many variates [8]. Let $T$ variates $v_1, v_2,
\cdots, v_T$ be given whose sum $A = \sum_{i=1}^{T} v_i$ is desired. Clearly

$A = T\bar v_p, \qquad \bar v_p = \sum v_i / T.$

Now let us estimate $A$ by $A' = T_s\bar v_s$, where $T_s$ is an estimate of $T$ and $\bar v_s$ is an
estimate of $\bar v_p$. If $\sigma_{T_s}$ is very small, $\rho_1 = T/\sigma_{T_s}$ will be large and
$\rho_2 = \bar v_p/\sigma_{\bar v_s} = \sqrt{N}\,\bar v_p/\sigma_p$ will be very large. Assuming $T_s$ is distributed
normally and obviously $\bar v_s$ is distributed normally for $N$ large, we see by the
theorems of this paper that $A'$ will be distributed normally. Confidence limits
for $A$ may be calculated in the usual fashion as $A' \pm y\sigma_{A'}$, where $y$ is
determined from the normal probability integral, with $\alpha$ generally chosen as
.025 or less and

$\sigma_{A'}^2 = T^2\sigma_{\bar v_s}^2 + \bar v_p^2\sigma_{T_s}^2 + \sigma_{T_s}^2\sigma_{\bar v_s}^2.$
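
A short sketch of such confidence limits follows. It rests on assumptions that go beyond the text: the two estimates are taken to be independent, the variance of their product is taken as the sum of the three terms $T^2\sigma_{\bar v}^2$, $\bar v^2\sigma_T^2$ and $\sigma_T^2\sigma_{\bar v}^2$ (the $r = 0$ case of (2.2) rescaled), the unknown true values are replaced by the estimates, and the normal deviate $y = 1.96$ is used for $\alpha = .025$ in each tail. The function name and the sample figures are purely illustrative.

```python
import math


def product_confidence_limits(t_s, sd_t, v_s, sd_v, y=1.96):
    """Limits t_s * v_s -/+ y * sigma for the product of two independent,
    approximately normal estimates with standard errors sd_t and sd_v."""
    estimate = t_s * v_s
    variance = (t_s * sd_v) ** 2 + (v_s * sd_t) ** 2 + (sd_t * sd_v) ** 2
    sigma = math.sqrt(variance)
    return estimate - y * sigma, estimate + y * sigma


# Illustrative figures: about 1000 variates averaging about 5.
lower, upper = product_confidence_limits(1000.0, 10.0, 5.0, 0.1)
```

With these figures the variance is $10000 + 2500 + 1 = 12501$, so the limits are $5000 \pm 1.96\sqrt{12501}$.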


Stratification is also possible. It is interesting to note that many functions which
occur in life insurance are products. Such applications will be treated fully
elsewhere. Naturally the critical region, whether both tails or one tail of the
distribution should be used, depends on the alternatives to the hypothesis being
tested.

Generalizations of the main theorem are possible for the probability function
of $z = \prod_{i=1}^{n} x_i$, where $x_1, x_2, \cdots, x_n$ follow a multivariate normal probability
function. These will be investigated in a later paper. It may be noted that
J. B. S. Haldane has investigated the distribution of a product along different
lines [9].
NOTES

This section is devoted to brief research and expository articles on methodology 
and other short items. 


A REMARK ON CHARACTERISTIC FUNCTIONS 

By A. Zygmund 


University of Pennsylvania 

1. Let $F(x)$, $-\infty < x < +\infty$, be a distribution function, and

$\varphi(t) = \displaystyle\int_{-\infty}^{+\infty} e^{itx}\,dF(x)$

its characteristic function. It is well known that the existence of $\varphi'(0)$ does
not imply the existence of the absolute moment

(1)  $\displaystyle\int_{-\infty}^{+\infty} |x|\,dF(x).$


A simple example is provided by the function

$\varphi(t) = C \displaystyle\sum_{n=2}^{\infty} \dfrac{\cos nt}{n^2 \log n},$

where $C$ is a positive constant. Since the series on the right differentiated term
by term converges uniformly (see [1]), $\varphi'(t)$ exists (and is continuous) for all
values of $t$, and in particular at the point $t = 0$. Obviously $\varphi(t)$ is the
characteristic function of the masses $C/(2n^2 \log n)$ concentrated at the points $\pm n$
for $n = 2, 3, \cdots$. The constant $C$ is such that the sum of all the masses is 1.
The divergence of the series $\sum 1/(n \log n)$ implies that in this particular case the
moment (1) is infinite.

In a recent paper (see [2], esp. p. 120, footnote), Fortet raises the problem of
whether the existence of $\varphi'(0)$ implies the existence of the first algebraic moment

(2)  $\displaystyle\int_{-\infty}^{+\infty} x\,dF(x) = \lim_{X \to +\infty} \int_{-X}^{X} x\,dF(x).$

The main purpose of this note is to show that this is so. We shall even prove
a slightly more general result.

A function $\psi(t)$ defined in the neighborhood of a point $t_0$ is said to be smooth
at this point if

$\displaystyle\lim_{h \to +0} \dfrac{\psi(t_0 + h) + \psi(t_0 - h) - 2\psi(t_0)}{h} = 0.$

Clearly, if $\psi$ has a one-sided derivative at the point $t_0$, the derivative on the
other side also exists and has the same value. Thus the graph of $\psi(t)$ has no
angular point for $t = t_0$, and this explains the terminology. If $\psi'(t_0)$ exists and
is finite, $\psi(t)$ is smooth for $t = t_0$. The converse is obviously false, since any
function whose graph is symmetric with respect to $t = t_0$ is smooth at that
point.

Theorem 1. If the characteristic function $\varphi(t)$ is smooth at the point 0, then
a necessary and sufficient condition for the existence of $\varphi'(0)$ is the existence of the
moment (2). The value of (2) is $-i\varphi'(0)$.

In particular, the existence and finiteness of $\varphi'(0)$ implies the existence of (2).
That the converse is false, is obvious. For if $a_0, a_1, a_2, \cdots$ are positive
numbers and $a_0 + 2a_1 + 2a_2 + \cdots = 1$, then $\varphi(t) = a_0 + 2\sum_{n=1}^{\infty} a_n \cos nt$ is the
characteristic function of the distribution function $F(x)$ corresponding to masses
concentrated at the integer points $\pm n$ and having the values $a_n$ there. Owing
to the symmetry of the masses, the number (2) exists, and is zero even if $\varphi(t)$
is non-differentiable for $t = 0$ (we may e.g. take for $\varphi(t)$ the Weierstrass
non-differentiable function $C \sum a^n \cos b^n t$, where $C$ is a suitable constant).

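Proof. We may write

$\varphi(t) = \displaystyle\int_0^\infty \cos xt\,dG(x) + i\int_0^\infty \sin xt\,dH(x) = \varphi_1(t) + i\varphi_2(t),$

where

$G(x) = F(x) - F(-x), \qquad H(x) = F(x) + F(-x).$

Thus

(3)  $0 \le |\Delta H| \le \Delta G.$

Since $\varphi(t)$ is smooth at the point 0, and since $\varphi_1(t)$ is even, $\varphi_2(t)$ odd,

$0 = \displaystyle\lim_{h \to +0} \dfrac{\varphi(h) + \varphi(-h) - 2\varphi(0)}{h} = 2\lim_{h \to +0} \dfrac{\varphi_1(h) - \varphi_1(0)}{h} = -2\lim_{h \to +0} \int_0^\infty \dfrac{1 - \cos hx}{h}\,dG(x),$

so that, replacing $h$ by $2h$,

$\displaystyle\int_0^\infty \dfrac{\sin^2 hx}{h}\,dG(x) \to 0.$

Since the integrand is positive we obtain successively

$\displaystyle\int_0^{1/h} \dfrac{\sin^2 hx}{h}\,dG(x) = o(1), \qquad \int_0^{1/h} \dfrac{(hx)^2}{h}\,dG(x) = o(1),$

(4)  $\displaystyle\int_0^{1/h} x^2\,dG(x) = o(h^{-1}),$

$\displaystyle\int_{1/2h}^{1/h} x^2\,dG(x) = o(h^{-1}),$

(5)  $\displaystyle\int_{1/2h}^{1/h} dG(x) = o(h).$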
Since $\varphi_1$ is even, the smoothness of $\varphi(t)$, and so also of $\varphi_1(t)$, at the point
$t = 0$ implies that $\varphi_1'(0)$ exists and is zero. If $h \to +0$,

$\dfrac{\varphi_2(h) - \varphi_2(0)}{h} = \displaystyle\int_0^\infty \dfrac{\sin xh}{h}\,dH(x) = \int_0^{1/h} + \int_{1/h}^{\infty} = A_h + B_h,$

$|B_h| \le h^{-1} \displaystyle\int_{1/h}^{\infty} |dH| \le h^{-1}\left(\int_{1/h}^{2/h} dG + \int_{2/h}^{4/h} dG + \cdots\right) \le h^{-1}\,o(h + h/2 + h/4 + \cdots) = o(1),$

by (3) and (5). Also

$A_h - \displaystyle\int_0^{1/h} x\,dH = \int_0^{1/h} \left(\dfrac{\sin hx}{hx} - 1\right) x\,dH = \int_0^{1/h} O(x^2h^2)\,x\,dG = \int_0^{1/h} O(x^2h)\,dG = o(1),$

by (3) and (4). Thus

$\dfrac{\varphi_2(h)}{h} = o(1) + \displaystyle\int_0^{1/h} x\,dH = o(1) + \int_{-1/h}^{1/h} x\,dF,$

and so

$\dfrac{\varphi(h) - \varphi(0)}{h} = o(1) + i\displaystyle\int_{-1/h}^{1/h} x\,dF.$

It follows that the existence of (2) is equivalent to the existence of the
right-hand side derivative of $\varphi(t)$ at the point $t = 0$, or, on account of smoothness,
to the existence of $\varphi'(0)$. Moreover, the value of (2) is $-i\varphi'(0)$. This
completes the proof of Theorem 1.


2. Suppose that a function $\psi(t)$ defined near the point $t_0$ satisfies for $h \to 0$
a relation

$\psi(t_0 + h) = \alpha_0 + \alpha_1 h/1! + \cdots + \alpha_{k-1}h^{k-1}/(k-1)! + [\alpha_k + o(1)]h^k/k!,$

where $\alpha_0, \alpha_1, \cdots, \alpha_k$ are constants. Then $\alpha_k$ is called the $k$th generalized
derivative of $\psi$ at the point $t_0$. It will be denoted by $\psi_{(k)}(t_0)$. The existence
and finiteness of $\psi^{(k)}(t_0)$ implies the existence of $\psi_{(k)}(t_0)$ and both numbers
are equal.

Another generalization of higher derivatives is based on the consideration of
the symmetric differences

$\Delta_h^1\psi(t_0) = \psi(t_0 + h) - \psi(t_0 - h),$

$\Delta_h^2\psi(t_0) = \psi(t_0 + 2h) - 2\psi(t_0) + \psi(t_0 - 2h),$

$\Delta_h^3\psi(t_0) = \psi(t_0 + 3h) - 3\psi(t_0 + h) + 3\psi(t_0 - h) - \psi(t_0 - 3h).$

If $\Delta_h^k\psi(t_0)/(2h)^k$ tends to a limit as $h \to +0$, this limit is called the $k$th
symmetric derivative of $\psi$ at the point $t_0$. We shall denote it by $D_k\psi(t_0)$. Clearly,
$D_k\psi(t_0)$ exists and equals $\psi_{(k)}(t_0)$ if the latter number exists.
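
The symmetric differences listed above follow the binomial pattern $\Delta_h^k\psi(t_0) = \sum_{j=0}^{k} (-1)^j \binom{k}{j}\psi(t_0 + (k - 2j)h)$, which can be verified against the three cases written out. The sketch below implements that pattern and the quotient $\Delta_h^k\psi(t_0)/(2h)^k$; the function names are hypothetical, and the test functions ($t^3$ and $\cos t$) are chosen only because their derivatives are known.

```python
import math
from math import comb


def symmetric_difference(psi, t0, h, k):
    """k-th symmetric difference of psi at t0 with step 2h."""
    return sum((-1) ** j * comb(k, j) * psi(t0 + (k - 2 * j) * h)
               for j in range(k + 1))


def symmetric_quotient(psi, t0, h, k):
    """Difference quotient whose limit as h -> +0 is the k-th symmetric derivative."""
    return symmetric_difference(psi, t0, h, k) / (2.0 * h) ** k


# Third symmetric quotient of t^3 (third derivative 6 everywhere).
cubic_quotient = symmetric_quotient(lambda t: t ** 3, 0.7, 0.01, 3)
# Second symmetric quotient of cos t at 0 (second derivative -1).
cosine_quotient = symmetric_quotient(math.cos, 0.0, 0.01, 2)
```

For a polynomial of degree $k$ the $k$th quotient is independent of $h$, so `cubic_quotient` equals 6 up to rounding error, while `cosine_quotient` approaches $-1$ as $h$ decreases.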

It is a simple matter to prove (see [3]) that if $k$ is a positive even integer,
and if the characteristic function $\varphi(t)$ has at $t = 0$ a finite symmetric derivative
$D_k\varphi(0)$, then $\int_{-\infty}^{+\infty} x^k\,dF(x)$ exists, and its value is $i^{-k}D_k\varphi(0)$.
Conversely, the existence of $\int_{-\infty}^{+\infty} x^k\,dF(x)$ obviously implies (for $k$ even) the
existence and continuity of $\varphi^{(k)}(t)$ for all $t$, and in particular at the point $t = 0$.

In order to obtain an extension of Theorem 1 to the case of derivatives of
odd order, we have to generalize the notion of smoothness. We shall say that
a function $\psi(t)$ satisfies for $t = t_0$ condition $S_k$ $(k = 1, 2, \cdots)$, if

$\Delta_h^{k+1}\psi(t_0) = o(h^k)$ as $h \to +0$.

For $k = 1$, condition $S_k$ is identical with smoothness at $t_0$. Clearly, if $\psi_{(k)}(t_0)$
exists, $\psi$ satisfies condition $S_k$ at $t_0$.

Theorem 2. Suppose that $k$ is a positive odd integer, and let $\varphi(t)$ be the
characteristic function of a distribution function $F(x)$. If $\varphi$ satisfies condition $S_k$
at the point 0, a necessary and sufficient condition for the existence of $D_k\varphi(0)$ is
the existence of the symmetric moment

(6)  $\displaystyle\int_{-\infty}^{+\infty} x^k\,dF(x) = \lim_{X \to +\infty} \int_{-X}^{X} x^k\,dF(x)$

whose value is then equal to $i^{-k}D_k\varphi(0)$. In particular, the existence of $\varphi_{(k)}(0)$
implies that of (6).

The proof of Theorem 2 is analogous to that of Theorem 1. Let $G(x)$ and
$H(x)$ have the same meaning as before. Since $k + 1$ is even, condition $S_k$
at the point $t = 0$ gives

$\Delta_h^{k+1}\varphi(0) = \displaystyle\int_{-\infty}^{+\infty} (e^{ixh} - e^{-ixh})^{k+1}\,dF(x) = (2i)^{k+1}\int_{-\infty}^{+\infty} (\sin xh)^{k+1}\,dF(x)$

$= (2i)^{k+1}\displaystyle\int_0^\infty (\sin xh)^{k+1}\,dG(x) = o(h^k),$

so that

$\displaystyle\int_0^{1/h} (\sin xh)^{k+1}\,dG(x) = o(h^k),$

(7)  $\displaystyle\int_0^{1/h} x^{k+1}\,dG(x) = o(h^{-1}),$

(8)  $\displaystyle\int_{1/2h}^{1/h} dG(x) = o(h^k).$
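On the other hand,

$\dfrac{\Delta_h^k\varphi(0)}{(2h)^k} = i^k\displaystyle\int_0^{\infty} \left(\dfrac{\sin xh}{xh}\right)^k x^k\,dH(x) = i^k\left(\int_0^{1/h} + \int_{1/h}^{\infty}\right) = i^k(A_h + B_h),$

say. Here

$|B_h| \le h^{-k}\displaystyle\int_{1/h}^{\infty} dG(x) = h^{-k}\left(\int_{1/h}^{2/h} + \int_{2/h}^{4/h} + \cdots\right) = h^{-k}\,o\left(h^k + (h/2)^k + \cdots\right) = o(1),$

by (8). Since

$\left(\dfrac{\sin u}{u}\right)^k = \{1 + O(u^2)\}^k = \{1 + O(u)\}^k = 1 + O(u)$

for small $u$, we immediately obtain

$A_h - \displaystyle\int_0^{1/h} x^k\,dH(x) = \int_0^{1/h} O(hx^{k+1})\,dG(x) = o(1),$

by (7). Collecting the results, we see that

$\dfrac{\Delta_h^k\varphi(0)}{(2h)^k} = i^k\displaystyle\int_{-1/h}^{1/h} x^k\,dF(x) + o(1),$

which completes the proof of Theorem 2.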

One more remark. By Theorem 2, the existence of the first moment is
equivalent to the existence of the first symmetric derivative

$D_1\varphi(0) = \lim_{h \to 0} [\varphi(h) - \varphi(-h)]/2h.$

In Theorem 1 we have a corresponding result for the ordinary first derivative

$\varphi'(0) = \lim_{h \to 0} [\varphi(h) - \varphi(0)]/h.$

There is no discrepancy here since at every point where $\varphi$ is smooth the two
notions of derivative are equivalent.
A LOWER BOUND FOR THE VARIANCE OF SOME UNBIASED
SEQUENTIAL ESTIMATES

By D. Blackwell and M. A. Girshick
Howard University and Bureau of the Census

Consider a sequence of independent chance variables $x_1, x_2, \cdots$, with identical
distributions determined by an unknown parameter $\theta$. We assume that $Ex_i = \theta$
and that $W_k = x_1 + \cdots + x_k$ is a sufficient statistic for estimating $\theta$ from
$x_1, \cdots, x_k$. A sequential sampling procedure is defined by a sequence of
mutually exclusive events $S_k$ such that $S_k$ depends only on $x_1, \cdots, x_k$ and
$\sum P(S_k) = 1$. Define $W = W_k$ and $n = k$ when $S_k$ occurs. In a previous paper
by one of the authors [1] it was shown that if $S_k = W_k\,C(S_1 + \cdots + S_{k-1})$
(where $C(A)$ denotes the event that $A$ does not occur), the function $V(W, n) =
E(x_1 \mid W, n)$ is an unbiased estimate of $\theta$, and $\sigma^2(V) \le \sigma^2(x_1)$. It is the purpose
of this note to obtain a lower bound for $\sigma^2(V)$. Our result is

Theorem I. $\sigma^2(V) \ge \sigma^2(x_1)/E(n)$.

We remark that the lower bound is actually attained in the classical case of
samples of constant size $N$. For in this case (see [1]), $V = E(x_1 \mid W_N) = W_N/N$.
In fact we shall show that in a sense this is the only case in which the lower bound
is attained.

The proof of Theorem I depends on certain properties of sums of independent
chance variables. These, formulated more generally than is required for the
proof of Theorem I, are given in

Theorem II. Let $x_1, x_2, \cdots$ be independent chance variables with identical
distributions, having mean $\theta$ and variance $\sigma^2(x_1)$. Let furthermore $\{S_k\}$ be any
sequential test for which $E(n)$ is finite. Let $W = x_1 + \cdots + x_k$ when $n = k$.
Then

(a) $\sigma^2(W - \theta n) \le \sigma^2(x_1)E(n)$.

(b) If $\sigma^2(n)$ is finite, the equality sign holds in (a).

(c) $E[x_1(W - \theta n)] = \sigma^2(x_1)$.

Proof of (a). Write $y_i = x_i - \theta$, and define $Y = y_1 + \cdots + y_k$ when
$n = k$. By definition,

(1)  $\sigma^2(W - \theta n) = \displaystyle\sum_{k=1}^{\infty} \int_{S_k} (y_1 + \cdots + y_k)^2\,dP.$

To prove (a), we must verify that the series on the right of expression (1)
converges and has sum $\le \sigma^2(x_1)E(n)$. Now

(2)  $\displaystyle\sum_{k=1}^{N} \int_{S_k} (y_1 + \cdots + y_k)^2\,dP \le \sum_{k=1}^{N} \int_{S_k} (y_1 + \cdots + y_k)^2\,dP + \int_{\{n > N\}} (y_1 + \cdots + y_N)^2\,dP$

     $= \displaystyle\sum_{k=1}^{N} \int_{\{n \ge k\}} y_k^2\,dP + 2\sum_{k=2}^{N} \int_{\{n \ge k\}} y_k(y_1 + \cdots + y_{k-1})\,dP.$

Since the event $\{n \ge k\}$ is independent of $y_k$, each term in the second sum
vanishes and the first sum becomes

(3)  $\displaystyle\sum_{k=1}^{N} \int_{\{n \ge k\}} y_k^2\,dP = \sigma^2(x_1)\sum_{k=1}^{N} P\{n \ge k\} = \sigma^2(x_1)[P\{n = 1\} + 2P\{n = 2\} + \cdots + NP\{n = N\} + NP\{n > N\}] \le \sigma^2(x_1)E(n).$

This establishes Theorem II(a).

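Proof of Theorem II(b). Write $z_i = |y_i|$ and let $Z = z_1 + \cdots + z_k$ when
$n = k$. From (a) it follows that $\sigma^2[Z - nE(z_1)]$ is finite. If in addition
$\sigma^2(n) < \infty$, then $E(Z^2) < \infty$. Thus the series

(4)  $\displaystyle\sum_{k=1}^{\infty} \int_{S_k} (z_1 + \cdots + z_k)^2\,dP = \sum_{1 \le i,j \le k < \infty} \int_{S_k} z_i z_j\,dP$

converges, so that the series

(5)  $\displaystyle\sum_{1 \le i,j \le k < \infty} \int_{S_k} y_i y_j\,dP$

converges absolutely. The terms of the latter series may be arranged to yield

(A):  $\displaystyle\sum_{k=1}^{\infty} \int_{S_k} (y_1 + \cdots + y_k)^2\,dP = \sigma^2(W - \theta n)$

or to yield

(B):  $\displaystyle\sum_{k=1}^{\infty} \int_{\{n \ge k\}} y_k^2\,dP + 2\sum_{k=2}^{\infty} \int_{\{n \ge k\}} y_k(y_1 + \cdots + y_{k-1})\,dP = \sigma^2(x_1)E(n).$

This proves Theorem II(b).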

Proof of Theorem II(c). It follows from Theorem II(a) that $E[x_1(W - \theta n)]$
is finite. If we show that

(6)  $E(W - \theta n \mid x_1) = x_1 - \theta$, i.e. $E(Y \mid y_1) = y_1$,

it will follow [1] that

(7)  $E[x_1(W - \theta n)] = E[x_1(x_1 - \theta)] = \sigma^2(x_1).$

To verify (6), it is sufficient to show that if $f(x_1)$ is the characteristic function
of an event depending only on $x_1$ (i.e. $f(x_1) = 1$ when the event occurs, $f(x_1) = 0$
otherwise)

(8)  $E(fy_1) = E(fY).$

Write $\phi_1 = 0$, $\phi_i = f \cdot (y_2 + \cdots + y_i)$, $i \ge 2$.

Then it is easily verified that

(9)  $E(\phi_j \mid x_1, \cdots, x_i) = \phi_i$ for $j > i$.

Hence it follows [2] that $E\phi = 0$ where $\phi = \phi_i$ when $n = i$. In our case $\phi =
fY - fy_1$, and $E\phi = 0$ yields (6). This completes the proof of Theorem II.

Proof of Theorem I. In [1] it is proved that $E[x_1(W - \theta n)] = E[V(W - \theta n)]$.
Hence employing Theorem II we get

(12)  $\sigma^2(x_1) = E[V(W - \theta n)] = \sigma(V)\sigma(W - \theta n)\rho,$

where $\rho$ $(0 < \rho \le 1)$ is the coefficient of correlation between $V$ and $W - \theta n$.
Substituting for $\sigma(W - \theta n)$ we get

(13)  $\sigma^2(x_1) \le \sigma(V)\sigma(x_1)\sqrt{E(n)}\,\rho \le \sigma(V)\sigma(x_1)\sqrt{E(n)}.$

Solving for $\sigma(V)$ we finally obtain

(14)  $\sigma^2(V) \ge \dfrac{\sigma^2(x_1)}{E(n)},$

which proves Theorem I.¹
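
A small exact enumeration makes Theorems I and II concrete. The sketch below is an illustration under assumptions not found in the text: the $x_i$ are Bernoulli variables with $P(x_i = 1) = p$, and sampling stops as soon as $W_k = 2$ or after four observations. All stopped paths are listed, and $E(n)$, $\sigma^2(W - \theta n)$ and the variance of $V = E(x_1 \mid W, n)$ are computed directly. The function names are hypothetical.

```python
def stopped_paths(p, stop, max_n):
    """List (probability, n, W, x1) for every stopped Bernoulli(p) path."""
    paths = []

    def extend(prefix, prob):
        k, w = len(prefix), sum(prefix)
        if k >= 1 and (stop(k, w) or k == max_n):
            paths.append((prob, k, w, prefix[0]))
            return
        extend(prefix + (1,), prob * p)
        extend(prefix + (0,), prob * (1.0 - p))

    extend((), 1.0)
    return paths


def summarize(p, stop, max_n):
    """Return E(n), var(W - p*n) and var(V) with V = E(x1 | W, n)."""
    paths = stopped_paths(p, stop, max_n)
    e_n = sum(pr * n for pr, n, w, x1 in paths)
    var_w = sum(pr * (w - p * n) ** 2 for pr, n, w, x1 in paths)
    mass, first = {}, {}
    for pr, n, w, x1 in paths:
        mass[(w, n)] = mass.get((w, n), 0.0) + pr
        first[(w, n)] = first.get((w, n), 0.0) + pr * x1
    # V is unbiased, so its variance is the mean square deviation from p.
    var_v = sum(m * (first[key] / m - p) ** 2 for key, m in mass.items())
    return e_n, var_w, var_v


e_n, var_w, var_v = summarize(0.5, lambda k, w: w >= 2, 4)
```

With $p = 1/2$ the enumeration gives $E(n) = 3.25$ and $\sigma^2(W - \theta n) = 0.8125 = \sigma^2(x_1)E(n)$, in agreement with Theorem II(b), while $\sigma^2(V)$ is about $0.0990$, above the bound $\sigma^2(x_1)/E(n) \approx 0.0769$ of Theorem I.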

If $\sigma^2(n)$ is finite, the equality sign in (14) will hold if and only if $\rho = 1$. We
shall now prove the following

Theorem III. Let $N$ be the minimum value of $n$ for which $P(n = N) \ne 0$.
Then, a necessary and sufficient condition that $\rho = 1$ is that $P(n = N) = 1$.

Proof. The sufficiency of this condition follows from the fact that if
$P(n = N) = 1$, $V = W/N$. To prove the necessity of this condition, we
observe that if $\rho = 1$, $V$ is a linear function of $W - n\theta$. That is,

(15)  $V = \alpha(W - n\theta) + \beta.$

Now, since $EV = \theta$ and $E(W - n\theta) = 0$, it follows that $\beta = \theta$. Also, since
by hypothesis $\sigma^2(V) = \sigma^2(x_1)/E(n)$ and $\sigma^2(W - n\theta) = \sigma^2(x_1)E(n)$, it follows
that $\alpha = 1/E(n)$. Hence the estimate $V$ is given by

(16)  $V = \dfrac{W - n\theta}{E(n)} + \theta.$

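¹ Under certain regularity conditions Cramér has obtained the inequality

$\sigma^2(x_1) \ge 1\Big/E\left(\dfrac{\partial \log f}{\partial\theta}\right)^2,$

where $f = f(x, \theta)$ is the density function of $x$ ([3], p. 475). Thus with the same regularity
conditions, our inequality yields

$\sigma^2(V) \ge 1\Big/E(n)E\left(\dfrac{\partial \log f}{\partial\theta}\right)^2,$

which is a special case of the results presented by J. Wolfowitz in this issue of the Annals.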
Let $N$ be defined as above. We note that $N < \infty$ since by hypothesis $E(n) < \infty$.
Let $V_N$ be the estimate of $\theta$ when the sequential test terminates with $n = N$.
Then $V_N = W/N$. Substituting this value in (16) we get

(17)  $\dfrac{W}{N} - \theta = \dfrac{W - N\theta}{E(n)}.$

We exclude the trivial case where $W = N\theta$. Then (17) yields $E(n) = N$.
That is, $P(n = N) = 1$. This proves the theorem.

We remark that $N$ may be a function of $\theta$ but for a fixed $\theta$, $n = N$ is fixed
when $\rho = 1$.
AN EXTENSION TO TWO POPULATIONS OF AN ANALOGUE OF
STUDENT'S t-TEST USING THE SAMPLE RANGE

By John E. Walsh
Princeton University

1. Summary. The modified t-test considered by Daly¹ (see [1]) is used to
develop one-sided significance tests to decide whether the mean of a new normal
population exceeds the mean of an old normal population having the same 
variance. Significance tests are also developed to decide whether the mean of 
the new population is less than the mean of the old population. These tests 
require very little computation for their application and are approximately as 
powerful as the most powerful tests of these hypotheses. 

2. Introduction. Let $r_1, \cdots, r_n$ $(n \le 10)$ be independently distributed
according to a normal distribution with zero mean and unit variance. Let $r_{(u)}$
denote the $u$th largest of the $r$'s. Then Daly has shown how to determine
numbers $g_\alpha$ such that

(1)  $\Pr[\bar r/(r_{(n)} - r_{(1)}) > g_\alpha] = \alpha,$
     $\Pr[\bar r/(r_{(n)} - r_{(1)}) < -g_\alpha] = \alpha.$

This note will use these relations to develop easily applied significance tests to
decide whether the mean $\nu$ of a new normal population exceeds the mean $\mu$ of
an old normal population with the same variance. Significance tests are also
developed to test $\nu < \mu$. The simplest case considered is that of testing a new
sample value $x$ on the basis of $n$ past sample values $y_1, \cdots, y_n$. Then the
significance test at significance level $\alpha$ to decide whether $\nu$ exceeds $\mu$ consists in
accepting $\nu > \mu$ if

$x > \bar y + g_\alpha\sqrt{n + 1}\,[y_{(n)} - y_{(1)}],$

where $y_{(u)}$ is the $u$th largest of $y_1, \cdots, y_n$.

The significance test of $\nu < \mu$ consists in accepting $\nu < \mu$ if

$x < \bar y - g_\alpha\sqrt{n + 1}\,[y_{(n)} - y_{(1)}].$

These tests are generalized to the case in which $x$ is the mean of a sample of
size $r$ from the new population, each of $y_1, \cdots, y_n$ is the mean of a sample of
size $s$ from the old population, and $z$ is the mean of a sample of size $t$ from the
old population. Then the tests at significance level $\alpha$ take the form

(2)  Accept $\nu > \mu$ if $x > (1 - C_1)\bar y + C_1 z + g_\alpha\sqrt{\dfrac{ns}{r} + (1 - C_1)^2 + \dfrac{nsC_1^2}{t}}\,[y_{(n)} - y_{(1)}],$

     Accept $\nu < \mu$ if $x < (1 - C_1)\bar y + C_1 z - g_\alpha\sqrt{\dfrac{ns}{r} + (1 - C_1)^2 + \dfrac{nsC_1^2}{t}}\,[y_{(n)} - y_{(1)}],$

where $C_1$ is a given constant which is selected by the person applying the test.
The introduction of the terms $z$ and $C_1$ allows less reliable past information to
be utilized by lumping it together in the $z$ term and using the constant $C_1$ to
weight this information according to its relative importance with respect to
the $y$'s.

The power of test (2) is compared with that of the corresponding Student t-test
for the case $C_1 = 0$ and $n \le 10$. In this comparison the quantities $x, y_1, \cdots, y_n$
are considered to be the given sample values which are used for the test, that is,
the quantities from which the means $x, y_1, \cdots, y_n$ were formed are not given.
It is found that the power of the Student t-test is only slightly greater than that
of the corresponding test (2). For the cases considered, however, it is well
known that the most powerful test of $\nu > \mu$ using the quantities $x, y_1, \cdots, y_n$
is the appropriate Student t-test. Similarly for testing $\nu < \mu$. Thus the tests
(2) considered are approximately as powerful as the most powerful tests of
$\nu > \mu$ and $\nu < \mu$ which use $x, y_1, \cdots, y_n$.

Examination of (2) shows that the amount of computation required for the
application of one of these tests is small. Consequently the tests (2) have the
desirable properties of being easily computed and nearly as powerful as any
tests which could be used for the given hypotheses. This suggests their use in
repetitive testing procedures which are concerned with the testing of the mean
of a new sample on the basis of the means of previous samples.

¹ This problem is also considered by Lord in [2]. This note was in proof when [2] appeared.

3. Statement of tests. In this section three significance tests of increasing
generality are stated. It is to be observed that each test is a particular example
of the test following it, so that tests (A) and (B) are special cases of test (C).
The reason for stating tests (A) and (B) is that these tests have a much simpler
appearance and will cover most cases of practical application.

(A). Let each of $x, y_1, \cdots, y_n$ represent the mean of a sample of size $r$, let
the values of the sample whose mean is $x$ have the distribution $N(\nu, \sigma^2)$ and the
values of the samples whose means are $y_1, \cdots, y_n$ have distribution $N(\mu, \sigma^2)$,
where the notation $N(\xi, \sigma^2)$ denotes the normal distribution with mean $\xi$ and
variance $\sigma^2$. Then the significance test of $\nu > \mu$ at significance level $\alpha$ is

Accept $\nu > \mu$ if $x > \bar y + g_\alpha\sqrt{n + 1}\,[y_{(n)} - y_{(1)}].$

The significance test to decide whether $\nu < \mu$ is

Accept $\nu < \mu$ if $x < \bar y - g_\alpha\sqrt{n + 1}\,[y_{(n)} - y_{(1)}].$

(B). Let $x$ equal the mean of $r$ sample values from $N(\nu, \sigma^2)$ and each of
$y_1, \cdots, y_n$ equal the mean of $s$ sample values from $N(\mu, \sigma^2)$. The significance
test for $\nu > \mu$ at significance level $\alpha$ is

Accept $\nu > \mu$ if $x > \bar y + g_\alpha\sqrt{\dfrac{ns}{r} + 1}\,[y_{(n)} - y_{(1)}].$

The test of $\nu < \mu$ is given by

Accept $\nu < \mu$ if $x < \bar y - g_\alpha\sqrt{\dfrac{ns}{r} + 1}\,[y_{(n)} - y_{(1)}].$

(C). Let $x$ equal the mean of $r$ sample values from $N(\nu, \sigma^2)$, each of $y_1, \cdots, y_n$
equal the mean of a sample of size $s$ from $N(\mu, \sigma^2)$, $z$ equal the mean of a sample
of size $t$ from $N(\mu, \sigma^2)$, and $C_1$ be a given constant value. Then the significance
test of $\nu > \mu$ at significance level $\alpha$ is

Accept $\nu > \mu$ if

$x > (1 - C_1)\bar y + C_1 z + [y_{(n)} - y_{(1)}]\,g_\alpha\sqrt{\dfrac{ns}{r} + (1 - C_1)^2 + \dfrac{nsC_1^2}{t}}.$

The significance test to decide whether $\nu < \mu$ is

Accept $\nu < \mu$ if

$x < (1 - C_1)\bar y + C_1 z - [y_{(n)} - y_{(1)}]\,g_\alpha\sqrt{\dfrac{ns}{r} + (1 - C_1)^2 + \dfrac{nsC_1^2}{t}}.$

Values of $g_\alpha$ for $\alpha = .05$ are given in Table I. These values were listed by
Daly in [1].²

² Values of $g_\alpha$ for $\alpha = .05$, .025, .01, .005, .001 and .0005 are listed in Table 9 of [2] for
sample sizes from 2 to 20.
4. Derivation of tests. As tests (A) and (B) are particular cases of test
(C), it is sufficient to derive test (C).

TABLE I

Estimated Values of $g_{.05}$

   n      $g_{.05}$

   3      .882
   4      .526
   5      .385
   6      .309
   7      .260
   8      .227
   9      .202
  10      .183
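
The small amount of computation needed can be seen from a sketch of test (B) at the .05 level. The multiplier $\sqrt{ns/r + 1}$ is used here because it makes $x - \bar y$, rescaled, have the same variance ratio to the $y$'s as $\bar r$ has to the $r$'s in Daly's statistic; the function names, the dictionary of tabulated constants, and the sample figures are assumptions of this illustration. Note that reporting both one-sided decisions from one call, as done here for convenience, corresponds to two separate tests each at level .05.

```python
import math

# Estimated values of g_.05 from Table I, indexed by the number n of past means.
G_05 = {3: 0.882, 4: 0.526, 5: 0.385, 6: 0.309,
        7: 0.260, 8: 0.227, 9: 0.202, 10: 0.183}


def range_margin(ys, r, s, g_table=G_05):
    """Half-width g * sqrt(n*s/r + 1) * (largest y - smallest y) of test (B)."""
    n = len(ys)
    return g_table[n] * math.sqrt(n * s / r + 1.0) * (max(ys) - min(ys))


def range_test_b(x, ys, r=1, s=1, g_table=G_05):
    """Apply both one-sided forms of test (B) to a new mean x."""
    y_bar = sum(ys) / len(ys)
    margin = range_margin(ys, r, s, g_table)
    if x > y_bar + margin:
        return "accept nu > mu"
    if x < y_bar - margin:
        return "accept nu < mu"
    return "no decision"


past_means = [10.0, 10.4, 9.8, 10.1, 9.7]   # illustrative values, n = 5
margin = range_margin(past_means, 1, 1)
decision_high = range_test_b(10.8, past_means)
decision_mid = range_test_b(10.5, past_means)
decision_low = range_test_b(9.2, past_means)
```

Here $\bar y = 10.0$ and the range is 0.7, so the margin is $.385 \times \sqrt{6} \times 0.7$, a little over 0.66.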


Let the quantities $x', y'_1, \cdots, y'_n, z'$ be defined by

$x' = \dfrac{(x - \nu)\sqrt{r}}{\sigma}, \qquad y'_i = \dfrac{(y_i - \mu)\sqrt{s}}{\sigma} \quad (i = 1, \cdots, n), \qquad z' = \dfrac{(z - \mu)\sqrt{t}}{\sigma}.$

Then $x', y'_1, \cdots, y'_n, z'$ are independently distributed according to $N(0, 1)$.
Define


r„ = (^iiy'u — 2 2/» + Kix' + KiCz'^ , 
It is easily verified that 


(m = 1, • ■ • , n). 


E(7-„) = 0, Eirl) = -2 [K? + (1 + C^)Kl - 2Jfx + n] 
E{rui\) = ^ [(1 + C^)KI — 2/Ci + n], 


{u 9^ v). 


Thus, if $K_1$ and $K_2$ satisfy the equations

(3) (V+ + 

(1 + C^)Kl - 2Ai + n - 0, 

the $r_u$ will be independent of $\mu$ when $\mu = \nu$. Also they will be independently
distributed according to $N(0, 1)$.
Rewriting the $r_u$ in terms of $x, y_1, \cdots, y_n, z$ one obtains
(4) r„ = - E ^ + ,)j. 

Using (3) the mean of the $r_u$ is found to be

Let $r_{(u)}$ denote the $u$th largest of $r_1, \ldots, r_n$. Then from (1)


ct = Pr[f/(r(„) - r(i)) > g^] = Pr + C* V 

+ /j/" ^ — (r — /t)^ ! {yt^n) — Viv,) > 

It is easily proved from (3) that 


-Ki ^ 
KWr 



Wr + GVty \ 
s(l + C^) ) 


Choosing the positive sign, putting C 



Cl , and letting p. = v one obtains 


$$\Pr\{\, x > (1 - C_1)\bar{y} + C_1 z + [y_{(n)} - y_{(1)}]\,g_\alpha \,\} = \alpha ,$$

verifying the first part of test (C). The second part of test (C) is verified by
choosing the negative sign (or by repeating the above argument using
the second part of (1)).
6. Power comparison with t-test. Let $x, y_1, \ldots, y_n$ satisfy the conditions
of test (B) in section 3. Then Student's $t$ using $x, y_1, \ldots, y_n$ is given by

^ ^ [x - ly - (v - g)1 _ / n - 1 ~ 

The Student t-test based on this value of $t$ furnishes the most powerful test of
$\nu > \mu$ (and $\nu < \mu$) using $x, y_1, \ldots, y_n$. The purpose of this section is to show
that test (B) has approximately the same power as this Student t-test for $n \le 10$.

Daly has shown (see [1]) that if $r_1, \ldots, r_n$ are independently distributed
according to $N(\zeta, \sigma^2)$, then the test based on

$$(\bar{r} - \zeta)/(r_{(n)} - r_{(1)})$$

has approximately the same power for testing $\zeta > 0$ (and $\zeta < 0$) as the corresponding
Student t-test based on

(5) $$t = \frac{(\bar{r} - \zeta)\sqrt{n(n-1)}}{\sqrt{\sum_u (r_u - \bar{r})^2}}$$

for $n \le 10$.

Using the notation of section 4 let 


-v/ s 


KiVu — y, Ki 




(u= 1, ■■■ ,n), 


where ^ > 0 
iS-2 


Then from consideration of (4) with $C = 0$ it is seen that the $r_u$
are independently distributed according to $N(\zeta, \sigma^2)$, where $\zeta$ equals a positive
constant times $(\nu - \mu)$. Following the derivations in section 4 with $C = 0$,
it is seen that the test of $\zeta > 0$ with this particular choice of the $r_u$ is identical
with the test of $\nu > \mu$ given in (B) of section 3. Similarly the test of $\zeta < 0$ is
identical with the test (B) of $\nu < \mu$. Thus the test (B) has approximately the
same power for testing $\nu > \mu$ (and $\nu < \mu$) as the Student t-test based on the value
of $t$ given in (5) if $n \le 10$. Replacing the $r_u$ in (5) by their values in terms of
$x, y_1, \ldots, y_n, n, r$, and $s$, it is found that (5) becomes


[x - g ~ (v - m )] 

(y. - y)^ 



This proves that test (B) is approximately as powerful for testing $\nu > \mu$ and
$\nu < \mu$ as the most powerful test based on the quantities $x, y_1, \ldots, y_n$ if $n \le 10$.
As test (A) is a particular case of test (B), these results also apply to test (A).

ON THE NORM OF A MATRIX

By Albert H. Bowker 
University of North Carolina 

In studying the convergence of iterative procedures in matrix computation
and in setting limits of error after a finite number of steps, Hotelling [1] used
the square root of the sum of squares of the elements of a matrix as its norm. A
wide class of functions exists which may be employed as norms in matrix calculation
and substituted directly in the expressions derived by Hotelling. The
purpose of this note is to make a few general remarks about this class of functions
and to propose a new norm which appears to have some value in computation.
A function $\phi(A)$ of the elements of a real matrix $A$ may be termed a legitimate
norm if it has the following four properties:
(1) $\phi(cA) = |c|\,\phi(A)$, $c$ a scalar;

(2) $\phi(A + B) \le \phi(A) + \phi(B)$, if $A + B$ is defined;

(3) $\phi(AB) \le \phi(A)\phi(B)$, if $AB$ is defined;

(4) $\phi(e_{ij}) = 1$, where $e_{ij}$ is a fundamental unit matrix

whose elements are all zero except the one in the $i$th row and $j$th column, whose
value is unity. These four conditions are identical with the first four axioms
of Rella [2], who has shown them to be independent. Properties (1), (2), and
(3) are used directly in investigations of convergence and error, but the importance
of property (4) is indicated by some of its immediate consequences.
Clearly $e_i' A e_j = a_{ij}$, where $e_i$ is a fundamental unit vector. From (3) and (4)
it follows that $|a_{ij}| \le \phi(A)$ for all $i$ and $j$ and we have that

(5) $\max_{(i,j)} |a_{ij}| \le \phi(A)$.

Thus $\phi(A)$ has the useful property that the norm of a matrix of errors exceeds
or equals the maximum possible error. Since $\phi(A^m) \le \phi^m(A)$, it follows from
(5) that the elements of $A^m$ will tend to zero as $m$ increases if $\phi(A) < 1$, a result
which is useful in establishing convergence. Also $\phi(A) \ge 0$.

One further consequence of (1) to (4) is of interest. Suppose $A$ is a square
matrix and let $\lambda$ be any of its roots. Then there exists a non-null vector $x$
such that $Ax = \lambda x$. Now $\phi(\lambda x) = |\lambda|\,\phi(x) \le \phi(A)\phi(x)$ and we have

(6) $|\lambda| \le \phi(A)$.

Thus, every legitimate norm is an upper bound to the characteristic roots.
Clearly many functions exist which satisfy (1) to (4). The norm used by
Hotelling is $N(A) = \sqrt{\sum_{i,j} a_{ij}^2}$. A norm which may have some value is
obtained as follows:

(7) $R(A) = \max_{(i)} R_i(A)$

where

$$R_i(A) = \sum_j |a_{ij}| .$$

Clearly $R(cA) = |c|\,R(A)$. To show that $R$ satisfies (2), consider

$$R_i(A + B) = \sum_j |a_{ij} + b_{ij}| \le \sum_j |a_{ij}| + \sum_j |b_{ij}| \le R(A) + R(B).$$

Since the above inequality holds for all $i$,

$$R(A + B) \le R(A) + R(B).$$
Now $AB = \| \sum_\alpha a_{i\alpha} b_{\alpha j} \|$

and

$$R_i(AB) = \sum_j \Big| \sum_\alpha a_{i\alpha} b_{\alpha j} \Big| \le \sum_j \sum_\alpha |a_{i\alpha}| \cdot |b_{\alpha j}| \le \sum_\alpha |a_{i\alpha}|\, R_\alpha(B) \le R(B)R(A).$$

Hence $R(AB) \le R(A)R(B)$. Clearly $R(e_{ij}) = 1$. Similarly it may be shown
that $C(A) = \max_{(j)} \sum_i |a_{ij}|$ also satisfies the conditions of a norm.

Since the convergence of an iterative procedure is often proved by the norm
being less than one, since the norm appears in the upper bound for the error
after a finite number of iterations, and since the norm of a matrix of errors is
taken to indicate the magnitude of the errors, a reasonable method of choosing
among several available legitimate norms is to select the smallest. It is natural
to inquire whether an optimum norm in this sense exists, that is, is there a
function $\phi^*(A)$ such that $\phi^*(A)$ possesses properties (1) through (4) and such
that $\phi^*(A) \le \phi(A)$ for all other $\phi(A)$ satisfying these conditions. Assume such
a $\phi^*(A)$ does exist. Clearly $\phi^*(A) = \phi^*(A')$, as, if either exceeded the other,
the smaller could be taken as $\phi^*(A)$. Let $\Lambda^2$ be the largest root of $AA'$. Then
by (6)

$$\Lambda^2 \le \phi^*(AA') \le \phi^{*2}(A) \quad \text{and} \quad \Lambda \le \phi^*(A).$$

But Rella [2] has shown that $\Lambda$ possesses (1) to (4). Thus

$$\phi^*(A) = \Lambda .$$

But, for a row vector, $C(A) \le \Lambda$. Consequently, no minimal norm exists. It is
interesting to note that a worst norm does exist, namely $P(A) = \sum_{i,j} |a_{ij}|$.
Since $A = \sum_{i,j} a_{ij} e_{ij}$, $\phi(A) \le P(A)$. Clearly $P(A)$ satisfies (1) to (4) and hence
is the worst possible legitimate norm.

In practical computation, the choice so far is between $N(A)$ and $R(A)$ (or
$C(A)$). No general inequalities exist and it would probably be advisable to
compute both. $R(A)$ may be less than $N(A)$ and indicate convergence when
$N(A)$ fails to do so. Often $R(A)$ may be computed visually and convergence
proved without computing the sum of squares of the elements.
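The four norms discussed above are easy to compare numerically. The following sketch (plain Python; the function names and the example matrix are illustrative choices, not part of the note) computes $N(A)$, $R(A)$, $C(A)$ and $P(A)$ for a small matrix in which $R(A)$ is below one while $N(A)$ is not.

```python
import math


def norm_n(a):
    """Hotelling's norm N(A): square root of the sum of squared elements."""
    return math.sqrt(sum(v * v for row in a for v in row))


def norm_r(a):
    """R(A): largest row sum of absolute values."""
    return max(sum(abs(v) for v in row) for row in a)


def norm_c(a):
    """C(A): largest column sum of absolute values."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def norm_p(a):
    """P(A): sum of the absolute values of all elements."""
    return sum(abs(v) for row in a for v in row)


def max_abs_element(a):
    """Left side of inequality (5)."""
    return max(abs(v) for row in a for v in row)


# Illustrative matrix: every row and column sums to about 0.9 in absolute value.
A = [[0.8, 0.1, 0.0],
     [0.0, 0.8, 0.1],
     [0.1, 0.0, 0.8]]

for name, f in (("N", norm_n), ("R", norm_r), ("C", norm_c), ("P", norm_p)):
    print(name, round(f(A), 4))
```

For this matrix $R(A)$ and $C(A)$ are about 0.9, so either proves convergence of the powers of $A$, while $N(A)$ is about 1.40 and proves nothing; $P(A)$ is the largest of the four, as expected of the worst norm.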

The functions $N(A)$ and $R(A)$ may also be useful in finding a simple first
approximation to $A^{-1}$. A sufficient condition that Hotelling's iterative method
for finding the inverse of a matrix $A$ will converge is that the roots of
$D = 1 - AC_0$ be less than one in absolute value, where $C_0$ is a first approximation
to $A^{-1}$. If the iterative procedure is to be carried out by a fully automatic
computing machine such as the one described by Alt [3] it may be advisable to
start with a rather poor first approximation which is easy to construct. If $A$
has positive roots and if $M$ is any upper bound to these roots and if $C_0$ is a matrix
with diagonal elements equal to $1/M$ and zeros elsewhere, the iterative procedure
will converge but the norm of $D$ will not necessarily be less than one. From
(6), any legitimate norm may be taken as $M$.
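As a sketch of the starting rule just described, the code below takes $M = R(A)$, sets $C_0$ to the diagonal matrix with entries $1/M$, and iterates. The note does not write out the iteration; the form $C_{k+1} = C_k(2 - AC_k)$ used here is the usual statement of Hotelling's method and is an assumption of this example, as are the function names and the test matrix.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_row_sum(a):
    """R(A), used here as the upper bound M to the roots of A."""
    return max(sum(abs(v) for v in row) for row in a)


def iterative_inverse(a, steps=30):
    """Approximate the inverse of a matrix with positive roots.

    Start from C_0 = (1/M) I with M = R(A), then repeat
    C <- C (2I - A C).  (Assumed form of the iteration.)
    """
    n = len(a)
    m = max_row_sum(a)
    c = [[1.0 / m if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        ac = matmul(a, c)
        correction = [[(2.0 if i == j else 0.0) - ac[i][j] for j in range(n)]
                      for i in range(n)]
        c = matmul(c, correction)
    return c


# Symmetric matrix with positive roots; its exact inverse is (1/11)[[3,-1],[-1,4]].
A = [[4.0, 1.0],
     [1.0, 3.0]]
C = iterative_inverse(A)
```

The error matrix $1 - AC_k$ is squared at each step, so a poor start of this kind costs only a few extra iterations.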
Finally, it is interesting to point out the relation of this note to some work on
the problem of finding upper bounds to the roots. In fact, the inequalities
$\lambda \le N(A)$ and $\lambda \le R(A)$, which are consequences of (6), are Theorem 2 of
Parnell [4] and Theorem 3 of Barankin [6] respectively.

DEFINITION OF THE PROBABLE DEVIATION

By M. Fréchet

Faculty of Science, University of Paris

The probable deviation has recently been defined by E. J. Gumbel [1], [2]
as the smallest of the intervals corresponding to the probability 1/2. It so happened
that the author was led to an equivalent definition starting from a general
idea which may be applied to absolutely general cases and which, for this reason,
might be of interest.

In recent years, the author has been occupied with a study of random elements
of any nature (curves, surfaces, functions, qualitative elements), a study
whose future seems promising [3]. I gave a definition of the mean of such an
element expressed by an abstract integral which, however, is only defined if the
random element is situated in a metric vectorial (Wiener-Banach) space.¹ But²
a still more general definition is valid if the random element is placed in any
metric space. It consists of taking, as mean position of the random element $X$,
a fixed (non-statistical) element $b = \bar{X}$ such that the function of $a$ which represents
the mean $M(X, a)^2$ of the squared distance of $X$ to the fixed element $a$,
is minimum for $a = b$. (In the case where $X$ and $a$ are numbers, and where
$M(X^2)$ is finite, we know that this minimum is reached and that there is one,
and only one, determination $b$ of $a$.) This definition has the advantage of also
defining the equiprobable position of $X$. This is a fixed element $c = \tilde{X}$ such
that $M(X, a)$ is minimum for $a = c$. (If $X$ and $a$ are numbers, we know that
this minimum is still reached, but may be so reached by several values of $\tilde{X}$.)

Since reading Gumbel's paper, a still more general definition suggested itself.

¹ For the definition of metric vectorial spaces see [4].
² See Note 2, p. 503 of [4].

The expressions $M(X, a)$ and $\sqrt{M(X, a)^2}$ themselves may be considered as
distances, but as distances of two random elements taken together. To each
of these distances corresponds as minimum, when $a$ varies, a different "typical"
position $\tilde{X}$ or $\bar{X}$, .... Thus, without supposing anything about the space
into which the different trials place $X$, we assume that we have defined a "deviation"
of two random elements $X$, $Y$ taken together. We represent this
function of two random variables by $(([X], [Y]))$, a notation which differs from
the representation of the distance $(X, Y)$ of the two positions $X$ and $Y$ with
respect to a single trial. The lower boundary of the deviation $(([X], [a]))$, a
function of $a$, which is reached for $a = \hat{X}$ defines a "typical" position $\hat{X}$. Moreover,
the value of this $(([X], [\hat{X}]))$ may be considered as a measure or, at least,
as a numerical ranging point of the dispersion of $X$.

Let us abandon these generalities. They hold especially if the element $X$
is a real valued random variable. Among the possible and reasonable expressions
for the deviation $(([X], [a]))$ of the numerical variate $X$ from a fixed number
$a$, we may use the equiprobable value of $|X - a|$ which may be called the equiprobable
deviation of $X$ from $a$. Thus we have, on one side, a new "typical
value" of $X$ which will be a value of $a$ such that the equiprobable deviation of $X$
from $a$ is minimum, and a new measure of dispersion which is the value of this
minimum and which might be called simply the equiprobable deviation of $X$.

In the case where $X$ has everywhere a continuous and finite density of probability
$w(x)$ we find, as typical value, what Gumbel calls the "midvalue"
and represents by 4, and, as equiprobable deviation, what Gumbel calls the
"probable deviation" and represents by f.

We may also consider the discontinuous case, which was given as a problem
to candidates of the "Certificat d'Etudes Supérieures de Calcul des Probabilités,
Option Statistique Mathématique, Session May-June, 1944." They had to
solve various questions of which I cite the beginning below³:

"Consider $n$ real numbers $x_1 \le x_2 \le \cdots \le x_n$ and represent, by $E_a$, a median
value of the deviations $|x_k - a|$ of the numbers $x_k$ and $a$. If $a$ varies, $E_a$ has
a minimum $E$ which is reached by one or several values $A$ of $a$.

1) Explain, in a few words, the meaning of the values $E$ and $A$.

2) For simplicity's sake, suppose that $n$ is odd ($n = 2r + 1$). How should
$E$ and $A$ be calculated practically? (To find the answer, investigate first how
$E_a$ varies if $a$ varies only slightly.)

3) In the case where $n = 4s + 3$ ($s$ is an integer equal to, or larger than, zero)
show that $E \le \tfrac{1}{2}(Q_3 - Q_1)$

where $Q_1 = x_{s+1}$, $Q_3 = x_{n-s}$."
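Question 2 of the problem has a short computational answer, sketched below for odd $n = 2r + 1$. The reasoning is an assumption of this sketch rather than a quotation from the note: the median deviation $E_a$ is at most $e$ exactly when $r + 1$ of the points lie in $[a - e, a + e]$, so $E$ is half the width of the shortest interval covering $r + 1$ consecutive ordered values, and $A$ is its midpoint. The function names are hypothetical.

```python
def equiprobable_deviation(xs):
    """Return (E, A) for an odd number of real values.

    E is the minimum over a of the median of |x_k - a|;
    A is one value of a reaching it (the first shortest window is used).
    """
    xs = sorted(xs)
    n = len(xs)
    if n % 2 == 0:
        raise ValueError("this sketch treats only odd n = 2r + 1")
    r = (n - 1) // 2
    # Shortest window holding r + 1 consecutive ordered values.
    start = min(range(n - r), key=lambda i: xs[i + r] - xs[i])
    lo, hi = xs[start], xs[start + r]
    return (hi - lo) / 2, (hi + lo) / 2


def median_deviation_from(xs, a):
    """E_a: the median of |x_k - a| for odd n."""
    devs = sorted(abs(x - a) for x in xs)
    return devs[len(devs) // 2]


sample = [1, 2, 4, 5, 6, 10, 20]
E, A = equiprobable_deviation(sample)
print(E, A, median_deviation_from(sample, A))
```

For this sample ($n = 7$, $s = 1$) the quartiles are $Q_1 = x_2 = 2$ and $Q_3 = x_6 = 10$, so the semi-interquartile range is 4, while the shortest-window computation gives the smaller value $E = 2$.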

The study of this new typical value and of this new equiprobable deviation
has the advantage that their determination is very rapid and requires hardly
any calculations. However, we have to note an important inferiority of the
equiprobable deviation of $X$ compared to the mean and the standard deviations
of $X$. If one or the other of the last two deviations is zero, $X$ is a fixed number
(except for the case of the probability zero). This property seems requested
by the intuitive meaning which we attribute to the dispersion, and to every
measure or any mark of it. Now, the equiprobable deviation lacks this property.
If, for instance, $X$ has only three values: 0, 2, 1, the first two with the probability
0.249, and the last with the probability 0.502, the equiprobable deviation of $X$
will be zero, whereas $X$ will be equal to its typical value 1 only with a probability
of 0.502, and not with a probability equal to unity. The same holds
for any distribution for which there is a point with probability exceeding 1/2.

³ See the Remark at end of note.
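The three-valued example can be checked directly. In the sketch below the equiprobable value of a discrete variate is taken to be the smallest $m$ with $P(V \le m) \ge 1/2$; this convention for picking one median, and the function names, are assumptions made for the illustration.

```python
from fractions import Fraction


def equiprobable_value(values, probs):
    """Smallest m with P(V <= m) >= 1/2 for a discrete variate."""
    total = Fraction(0)
    for v, p in sorted(zip(values, probs)):
        total += p
        if total >= Fraction(1, 2):
            return v
    raise ValueError("probabilities sum to less than 1/2")


def equiprobable_deviation_from(a, values, probs):
    """Equiprobable value of |X - a|."""
    return equiprobable_value([abs(v - a) for v in values], probs)


values = [0, 2, 1]
probs = [Fraction("0.249"), Fraction("0.249"), Fraction("0.502")]

# Deviation from the typical value 1 is zero although X is not constant.
print(equiprobable_deviation_from(1, values, probs))
```

Measured from $a = 1$ the deviation $|X - 1|$ is 0 with probability 0.502, so its equiprobable value is 0; measured from 0 or 2 it is 1.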
Remark. The definitions of the mean and of the equiprobable position become
meaningless in the case that $M(X, a)$, or $M(X, a)^2$, is infinite. However, we
succeeded in surmounting the difficulty, and in reaching definitions which are valid
even in this case. If $X$ is a number, the new definitions become equivalent to
the classical definitions of the mean and equiprobable value. The proofs are
given in two recent articles [5], [6].

THE GENERAL RELATION BETWEEN THE MEAN AND THE MODE
FOR A DISCONTINUOUS VARIATE

By M. Fréchet

Faculty of Science, University of Paris

Dr. Gumbel has pointed out that one of the author's arguments employed in
several particular cases (see [1]) can be employed in a general case which includes
them and leads to the following result: If a statistical variate $R$ has only positive
entire values differing from zero, and if its mean value $\bar{R}$ is smaller than, or
equal to, unity, the same holds for its equiprobable value $\tilde{R}$ and its mode $\hat{R}$.
There are two generalizations of this result which might be of interest:

1) On the one hand, the author has shown [2] that, if a variate $R$ can only
have values (entire or not) equal to, or larger than, zero, its equiprobable value
$\tilde{R}$ is, at most, equal to twice its mean value $\bar{R}$, and the inequality $\tilde{R}/\bar{R} \le 2$
cannot be improved, which means that the upper boundary of the first member
is exactly equal to (and not less than) two. The equality is reached when $R$
has only two values of equal probability, one of them being zero.

2) On the other hand, if $R$ is an integer positive variate equal to, or larger
than zero, it can be proven that, if $\bar{R} \le \alpha$, we have

(1) $$\hat{R} \le \frac{\alpha(\alpha + 3)}{2}.$$

Here, $\bar{R}$ and $\hat{R}$ stand for the mean and for the mode of $R$ respectively, and $\alpha$ is
a positive integer differing from zero. For example if $R$ is the number of repetitions
of an event with probability $p$, we have, for $n$ trials, $\bar{R} = np$, whence,
if $\alpha$ is the first integer number equal to, or larger than, $\bar{R}$ we have the inequality
(1) for the most probable number of repetitions. Naturally, this inequality
only has an interest if the second member of (1) is smaller than $n$ which means
that

$$\alpha(\alpha + 3) < 2n.$$

This presupposes

$$2n > np(np + 3)$$

or

$$n < \frac{2 - 3p}{p^2}$$

and, since $n$ must be positive,

$$p < \tfrac{2}{3}.$$

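To prove the inequality (1), let us write $\omega_\nu$ for the probability that $R = \nu$.
We have

$$\sum_{\nu=0}^{\infty} \omega_\nu = 1, \qquad \sum_{\nu=0}^{\infty} \nu\,\omega_\nu = \bar{R} \le \alpha$$

whence

(2) $$\sum_{\nu=0}^{\alpha-1} (\alpha - \nu)\,\omega_\nu \ge \sum_{\nu=\alpha+1}^{\infty} (\nu - \alpha)\,\omega_\nu .$$

Let the mode be $\hat{R} = \beta$; then

$$\omega_\nu \le \omega_\beta$$

and the first member in (2) is bounded by

(3) $$\frac{\alpha(\alpha + 1)}{2}\,\omega_\beta \ge \sum_{\nu=0}^{\alpha-1} (\alpha - \nu)\,\omega_\nu .$$

Now, either $\alpha < \beta$ or $\beta \le \alpha$. In the first case the second member in (2) leads to

(4) $$\sum_{\nu=\alpha+1}^{\infty} (\nu - \alpha)\,\omega_\nu \ge (\beta - \alpha)\,\omega_\beta$$

since the second member in (4) is one of the terms occurring in the sum. The
same inequality holds in the second case, $\beta \le \alpha$, hence it holds generally. It
follows from (2), (3), and (4) that

$$\frac{\alpha(\alpha + 1)}{2}\,\omega_\beta \ge (\beta - \alpha)\,\omega_\beta .$$

The probability $\omega_\beta$ is certainly different from zero, since $\sum_{\nu=0}^{\infty} \omega_\nu = 1$. Consequently

$$\frac{\alpha(\alpha + 1)}{2} \ge \beta - \alpha$$

or

$$\beta \le \frac{\alpha(\alpha + 3)}{2}$$

as stated in (1).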

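The equality in (1) is possible only if, from (3),

$$\alpha(\omega_\beta - \omega_0) + (\alpha - 1)(\omega_\beta - \omega_1) + \cdots + (\omega_\beta - \omega_{\alpha-1}) = 0$$

and from (4)

$$\omega_{\alpha+1} + 2\omega_{\alpha+2} + \cdots + (\beta - \alpha)\omega_\beta + \cdots = (\beta - \alpha)\,\omega_\beta$$

whence

(5) $$\omega_0 = \omega_1 = \cdots = \omega_{\alpha-1} = \omega_\beta$$

and

(5') $$\omega_{\alpha+1} = \cdots = \omega_{\beta-1} = \omega_{\beta+1} = \cdots = 0 .$$

The existence of the exceptional case proves that the inequality (1) cannot
be improved by replacing the second member by a smaller function of $\alpha$. In
the exceptional case, the only possible values of $R$ are

$$R = 0, 1, 2, \ldots, \alpha - 1, \alpha, \beta$$

and all values, except perhaps $\alpha$, are equiprobable. The probability $\omega_\alpha$ may
be, but need not be, equal to $\omega_\beta$.

Moreover

(6) $$\beta = \frac{\alpha(\alpha + 3)}{2} \ge \alpha$$

and $\beta = \alpha$ is possible only if $\alpha = \beta = 0$ whence, from (5), $\omega_\nu = 0$ except for
$\nu = 0$ which means that $R$ only has one value equal to zero. Except for this
trivial case, we have in the exceptional case $\beta > \alpha$, and there are $\alpha + 2$ possible
values for $R$. Then we must have

$$\omega_\beta = \omega_0, \qquad \sum_{\nu=0}^{\alpha} \omega_\nu + \omega_\beta = 1$$

whence

$$(\alpha + 1)\,\omega_\beta + \omega_\alpha = 1$$

and, from (5),

$$\alpha \ge \bar{R} = \sum_{\nu=0}^{\alpha-1} \nu\,\omega_\nu + \beta\omega_\beta + \alpha\omega_\alpha = \omega_\beta \left[ \frac{\alpha(\alpha - 1)}{2} + \frac{\alpha(\alpha + 3)}{2} \right] + \alpha\omega_\alpha = \alpha[(\alpha + 1)\,\omega_\beta + \omega_\alpha]$$

whence

(7) $$\bar{R} = \alpha .$$

From

$$1 = (\alpha + 1)\,\omega_\beta + \omega_\alpha \ge (\alpha + 2)\,\omega_\alpha$$

follows

(8) $$\omega_\alpha \le \frac{1}{\alpha + 2}, \qquad \omega_\beta = \frac{1 - \omega_\alpha}{\alpha + 1}.$$

These conditions (5), (5'), and (7) are necessary and sufficient for the existence
of the exceptional case.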
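The exceptional case can be built explicitly and checked with exact arithmetic. The sketch below constructs the distribution described by (5), (5') and (8) for a given $\alpha \ge 1$ and verifies that its mean is $\alpha$ while its largest modal value is $\alpha(\alpha + 3)/2$. The function names and the default choice $\omega_\alpha = 1/(2(\alpha + 2))$ are illustrative assumptions.

```python
from fractions import Fraction


def exceptional_distribution(alpha, w_alpha=None):
    """Probabilities {value: weight} of the extremal case for alpha >= 1.

    Values 0, ..., alpha - 1 and beta = alpha(alpha + 3)/2 share the weight
    w_beta = (1 - w_alpha)/(alpha + 1); the value alpha has weight w_alpha,
    which must not exceed 1/(alpha + 2).
    """
    if w_alpha is None:
        w_alpha = Fraction(1, 2 * (alpha + 2))   # illustrative choice
    w_alpha = Fraction(w_alpha)
    if w_alpha > Fraction(1, alpha + 2):
        raise ValueError("w_alpha must not exceed 1/(alpha + 2)")
    beta = alpha * (alpha + 3) // 2              # alpha(alpha + 3) is always even
    w_beta = (1 - w_alpha) / (alpha + 1)
    probs = {v: w_beta for v in range(alpha)}
    probs[alpha] = w_alpha
    probs[beta] = w_beta
    return probs


def mean(probs):
    return sum(v * p for v, p in probs.items())


def largest_mode(probs):
    """Largest value carrying the maximal probability."""
    top = max(probs.values())
    return max(v for v, p in probs.items() if p == top)


for alpha in range(1, 6):
    d = exceptional_distribution(alpha)
    print(alpha, mean(d), largest_mode(d))
```

Since the values $0, \ldots, \alpha - 1$ carry the same probability as $\beta$, the mode is not unique in this case; the bound (1) is attained by the largest modal value.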

If the equality in (1) is excluded, the mode $\beta$ and the smallest integer number
$\alpha$ which is equal to, or larger than, the mean, are related by

(9) $$\beta \le \frac{\alpha(\alpha + 3)}{2} - 1 = \frac{\alpha^2 + 3\alpha - 2}{2}.$$

As shown before, this general inequality, valid for any discontinuous variate,
which can assume only non-negative integer values, cannot be improved without
assuming specific properties of the distribution.

NOTE ON DIFFERENTIATION UNDER THE EXPECTATION SIGN
IN THE FUNDAMENTAL IDENTITY OF SEQUENTIAL ANALYSIS

By T. E. Harris
Princeton University

Let $z$ be any chance variable and $z_1, z_2, z_3, \ldots$ a sequence of independent
chance variables, each with the same distribution as $z$. Let $Z_N = z_1 + z_2 + \cdots + z_N$.
Let $\phi(t) = E e^{zt}$ for all complex $t$ for which the latter exists. Let $S_1, S_2, \ldots$
be a sequence of mutually exclusive events such that $S_j$ depends only
on $z_1, z_2, \ldots, z_j$, and $\sum_{j=1}^{\infty} P(S_j) = 1$. Let the chance variable $n$ be defined
as $n = j$ when $S_j$ occurs. Blackwell and Girshick [1], generalizing a result
of Wald [2], showed that if there is a positive constant $M$ such that

(1) $|Z_N| < M$ when $n > N$

then the identity

(2) $E[e^{Z_n t}(\phi(t))^{-n}] = 1$

holds for all complex $t$ for which $\phi(t)$ exists and $|\phi(t)| \ge 1$. Wald [3] established
conditions, including the existence of $\phi(t)$ for all real $t$, under which
(2) may be differentiated under the expectation sign an unlimited number
of times.

Without assuming the existence of $\phi(t)$ for a real $t$-interval the following result
holds: If (1) is true and if $E(z^k)$ and $E(n^k)$ are both finite, $k$ a positive integer,
then

(3) $$E\left\{ \left[ \frac{d^k}{ds^k}\, e^{isZ_n} (\phi(is))^{-n} \right]_{s=0} \right\} = 0$$

where $i = \sqrt{-1}$ and $s$ is real. Certain identities, obtained by differentiating
(2) and putting $t = 0$, can also be obtained from (3). For example, if $Ez = 0$,
and if $En^2$ and $Ez^2$ both exist then $EZ_n^2 = Ez^2\,En$.
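The identity $EZ_n^2 = Ez^2\,En$ can be checked on a simple stopping rule. The sketch below is an illustration chosen here, not taken from the note: $z = \pm 1$ with probability 1/2 each (so $Ez = 0$, $Ez^2 = 1$), and sampling stops when $|Z_N|$ first reaches an integer $a$, so that condition (1) holds with $M = a$. The moments are accumulated exactly over a finite number of steps; the function name and the truncation at `max_steps` are assumptions of the sketch.

```python
from fractions import Fraction


def stopped_walk_moments(a, max_steps=400):
    """Symmetric +/-1 walk stopped when |Z_N| first equals a.

    Returns (E n, E Z_n^2, probability still unstopped), with the sums
    truncated after max_steps steps.
    """
    half = Fraction(1, 2)
    alive = {0: Fraction(1)}          # distribution of Z_N on the event n > N
    e_n = Fraction(0)
    e_z2 = Fraction(0)
    for step in range(1, max_steps + 1):
        nxt = {}
        for pos, mass in alive.items():
            for move in (-1, 1):
                new = pos + move
                if abs(new) == a:     # the walk stops here: n = step
                    e_n += step * mass * half
                    e_z2 += new * new * mass * half
                else:
                    nxt[new] = nxt.get(new, Fraction(0)) + mass * half
        alive = nxt
    return e_n, e_z2, sum(alive.values())


e_n, e_z2, leftover = stopped_walk_moments(3)
print(float(e_n), float(e_z2), float(leftover))
```

With $a = 3$ both $En$ and $EZ_n^2$ come out as 9 up to a negligible truncation error, in agreement with $EZ_n^2 = Ez^2\,En$ since $Ez^2 = 1$.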

Let $P_N = P(n \le N)$, $p_N = P(n = N)$. Let $H(j, Z_j)$ and $F(N, Z_N)$ be the
conditional cumulatives of $Z_j$ and $Z_N$, for $n = j$ and $n > N$ respectively. Now
(2) was derived by Wald [2], p. 285, from a relation, valid whenever $\phi(t)$ exists,
which in the present notation becomes

(4) $$\sum_{j=1}^{N} p_j\, (\phi(t))^{-j} \int e^{Z_j t}\, dH(j, Z_j) + (1 - P_N)(\phi(t))^{-N} \int e^{Z_N t}\, dF(N, Z_N) = 1.$$

Examination of Wald's derivation of (4) shows it to be valid under the present
hypotheses. Now the finiteness of $E(z^k)$ clearly implies that of $E(|Z_j|^k \mid n = j)$.
Also, since $F(N, Z_N)$ is constant outside the interval $[-M, M]$, the integral
$\int |Z_N|^k\, dF(N, Z_N)$ is finite. Hence we may set $t = is$ in (4) and differentiate
$k$ times, obtaining for all real $s$

(5) $$\sum_{j=1}^{N} p_j \frac{d^k}{ds^k} \left[ (\phi(is))^{-j} \int e^{isZ_j}\, dH(j, Z_j) \right] + (1 - P_N) \sum_{r=0}^{k} \binom{k}{r} \frac{d^r}{ds^r} \left[ (\phi(is))^{-N} \right] \frac{d^{k-r}}{ds^{k-r}} \int_{-M}^{M} e^{isZ_N}\, dF(N, Z_N) = 0.$$

The derivatives of $(\phi(is))^{-N}$ are sums of terms of the form $Q(N)(\phi(is))^{-N-r}$
times terms independent of $N$, where $Q(N)$ is a polynomial in $N$ of degree $\le k$.
For any $r \le k$,

$$\lim_{N \to \infty} (1 - P_N) N^r = \lim_{N \to \infty} N^r \sum_{j=N+1}^{\infty} p_j \le \lim_{N \to \infty} \sum_{j=N+1}^{\infty} j^r p_j = 0$$

since $En^k$ is finite. Hence $\lim (1 - P_N) Q(N) = 0$. Because of (1) the integrals
in the second term of (5) are bounded as $N \to \infty$. Now set $s = 0$ in (5)
and then let $N \to \infty$. Since $\phi(0) = 1$, the second term of (5) approaches 0
and the limit of the first term is just the left side of (3).

For the case of a Wald sequential process, Stein [4] has shown that all moments
of $n$ are finite. In this case (3) holds whenever $Ez^k$ is finite.

A UNIQUENESS THEOREM FOR UNBIASED SEQUENTIAL
BINOMIAL ESTIMATION

By L. J. Savage¹

University of Chicago

In a recent note [1], J. Wolfowitz extended some of the results of a paper by
Girshick, Mosteller and Savage [2] on sequential binomial estimation. The
present note carries one of Wolfowitz's ideas somewhat further. The nomenclature
of [1] and [2] will be used freely. The concept of "doubly simple region"
introduced in [1] and assumed there only in the hypothesis of Theorem 3, will
here be shown to be unnecessarily restrictive. In so doing, we find that simplicity
is not only a necessary (cf. Theorem 4 of [2]) but also a sufficient condition
that $\hat{p}$ be the unique unbiased estimate of $p$ for a closed region.

¹ The author is a Rockefeller fellow at the Institute of Radiobiology and Biophysics,
University of Chicago.

Lemma. If $R$ is simple there is at most one bounded unbiased estimate of any
given function of $p$.

Proof. If the lemma were false, there would be a non-trivial bounded unbiased
estimate of zero, i.e., $m(\alpha)$ such that $|m(\alpha)|$ is bounded by a constant
$m^*$, $m(\alpha)$ not identically zero and $E(m(\alpha) \mid p) = 0$.

(1) $$E(m(\alpha) \mid p) = \sum m(\alpha) k(\alpha) p^y q^x = 0$$

and $m(\alpha)$ not identically zero. Since $R$ is simple we may assume (much as in
the proof of Theorem 6 of [2]) that we have a boundary point $\alpha_0$ such that
$m(\alpha_0) \ne 0$, $\alpha_0$ is below all accessible points of its own index and also below
every other $\alpha$ for which $m(\alpha) \ne 0$. Therefore

(2) $$|m(\alpha_0)|\, k(\alpha_0) p^{y_0} q^{x_0} = \Big| \sum_{y > y_0} m(\alpha) k(\alpha) p^y q^x \Big| \le m^* \sum_{y > y_0} k(\alpha) p^y q^x .$$

Let $M$ denote the set of all accessible points and boundary points at which
$x < x_0$ and $y = y_0 + 1$. There are at most $x_0$ points in $M$, say $\beta_1, \ldots, \beta_l$.
Considering the way in which $\alpha_0$ has been chosen, every path from (0, 0) to an $\alpha$
for which $y > y_0$ passes through or to at least one point of $M$. Therefore when
$y > y_0$

(3) $$P(\alpha) = k(\alpha) p^y q^x = P(\alpha \mid M) P(M) \le P(\alpha \mid M) \sum_{i=1}^{l} P(\beta_i) \le p^{y_0 + 1} \sum_{i=1}^{l} k(\beta_i)\, P(\alpha \mid M).$$

From inequalities (2) and (3),

(4) $$|m(\alpha_0)|\, k(\alpha_0) p^{y_0} q^{x_0} \le m^*\, p^{y_0 + 1} \Big( \sum_{i=1}^{l} k(\beta_i) \Big) \sum_{y > y_0} P(\alpha \mid M).$$

But it is impossible that (4) should be satisfied for small $p$.

Combining the Lemma with Theorem 4 of [2] we have the

Theorem. A necessary and sufficient condition that $\hat{p}(\alpha)$ be the unique proper
(bounded) and unbiased estimate of $p$ for a closed region $R$ is that $R$ be simple.

The sufficiency part of this Theorem extends Theorem 3 of [1] from doubly
simple regions to simple regions.

The author is indebted to J. Wolfowitz for his valuable suggestions in connection
with the present note.

ACKNOWLEDGEMENT OF PRIORITY

By H. E. Robbins
University of North Carolina

At the time of publication of my papers on the measure of a random set
(Annals of Math. Stat., Vol. 15 (1944), pp. 70-74, Vol. 16 (1945), pp. 342-347),
I was unaware that the theorem on page 72 of the first paper, which
affords a means of computing the expected value of the measure, had already
been found by A. Kolmogoroff (Grundbegriffe der Wahrscheinlichkeitsrechnung,
Ergebnisse der Mathematik, Berlin, 1933, p. 41). I wish to take this
opportunity of acknowledging Kolmogoroff's priority, which was pointed out
by Prof. Henry Scheffé.



ABSTRACTS OF PAPERS 

Presented on January 25, 1947, at the Atlantic City meeting of the Institute

1. A Test of Significance of the Coefficient of Rank Correlation for more than
Thirty Ranked Items. Nilan Norris, Hunter College.

Hotelling and Pabst (Annals of Math. Stat., Vol. 7 (1936), p. 37) have suggested the use
of the Tchebycheff inequality as an approximation for testing the significance of the coefficient
of rank correlation in cases where the number of ranked items is too large to enable
exact probabilities to be computed directly. A table prepared in accordance with this
suggestion indicates that for values of the coefficient of rank correlation larger than .50
there is a wide range of corresponding numbers of ranked items greater than thirty for
which at least the five per cent level of significance is satisfied.

For certain types of applications the conservativeness of the Tchebycheff test may be
a virtue rather than a limitation.

2. A Generalized T Measure of Multivariate Dispersion. Harold Hotelling,
University of North Carolina.

The problem of combining errors in two or more dimensions to measure the accuracy of
firing and bombing is similar to problems occurring in industrial quality control where
different measures of quality are applied to the same article, and to problems in mental
testing and other fields. If the covariances were known a priori, the solution optimum
m certain senses, for a multivariate normal distribution, would be the use of x“ = , 

where [Xq]~^ is the covariance matrix and v, is the deviation in the ith dimension. Since 
the covariances must in all known practical coses be estimated from a preliminary sample 
with (sayj n degrees of freedom, x“ may be replaced by , where [(,/]“* is 

the estimated covariance matrix This is the same T introduced by the authoi in 1931 
as a generalization of the Student ratio I, and has the same distribution. Upon adding 
together the values of T* for different cases (e g for different bombs dropped with the same 
bombsight), a combined measure of over-all excellence (e g of the bombsight), is ob- 
tained Tg like X®, can be bioken down into components meaningful with respect to the 
causal system, specifically in relation to possible sources of excessive discrepancy. Thus, 
if r, is the ih coordinate of the centroid, or mean point of impact, of m bombs, we may 
write T% = , Tj) — 2^0 — Tm Then To is a function only of deviations from 

the mean point of impact Asymptotically (for largo w) , To , Tu and To have the x dis- 
tribution with TO, 2 and to — 2 degrees of freedom respectively Hut the untiustworthiness 
of the X distribution as an approximation is evident even with n os large as 256, for which 
case calculations have been made The exact distributions of To and Td oic ascertained 
when the number ol variates f is 2, and the probability integrals aio expressed as linear 
functions of two incomplete beta functions In fact, Tl/M equals the sum of the roots 
of a determmantal equation of the form | A — A/I | = 0, where A and B are sample covariance 
matrices with n and m degrees of freedom lespectively, and a similai relation holds for To 
with m replaced by m' — 2 To and Tm have the distribution published in 1931 , with prob- 
ability integral expiessible in terms of a single incomplete beta function or the variance 
ratio distribution. It is shown that such parameters as the circular mean deviation are 
best estimated with the help of the T measures, not directly by averaging individual cir- 
cular deviations. 

3. Asymptotic Properties of Maximum and Quasi-Maximum Likelihood Estimates.
Herman Rubin, Cowles Commission for Research in Economics.

The results of J. L. Doob (Trans. Am. Math. Soc., Vol. 36 (1934), pp. 759-775) on consistency
of maximum likelihood estimates are generalized and extended to arbitrary measure
spaces. In some special cases, results on asymptotic normality of maximum likelihood
estimates can be generalized to quasi-maximum likelihood estimates (estimates based
on the assumption of a likelihood function which need not be the true function).

4. The Asymptotic Distribution of the Range. E. J. Gumbel, Newark College
of Engineering.

The asymptotic distribution of the range $w$ for initial unlimited distributions of the
exponential type is obtained by convolution of the asymptotic distributions of the two
extremes. Let $\alpha$ and $u$ be the parameters of the distributions of the extremes for a symmetrical
variate, and let $R = \alpha(w - 2u)$ be the reduced range. Then the probability
$\Psi(R)$ of the reduced range is subject to the differential equation $\Psi'' + \Psi' - \Psi \exp(-R) = 0$
which may be transformed into Bessel's equation of the first order by the substitutions
$R = 2(\log 2 - \log z)$, and $\Psi = zU$. The solution is $\Psi(R) = zK_1(z)$ for the asymptotic probability,
and $\psi(R) = (z^2/2)K_0(z)$ for the asymptotic distribution, $K_0(z)$ and $K_1(z)$ being the
modified Bessel function of the second kind of orders zero and unity. Thus tables of $\Psi(R)$
and $\psi(R)$ may be calculated for any symmetrical distribution of the exponential type.
The distribution of the range $w$ for normal samples of size 10 is already very close to the
asymptotic distribution provided that the parameters $\alpha$ and $u$ are determined from the
mean and the standard deviation of the range. This method permits the calculation of
the distribution of the range for normal samples of any size larger than 10.

5 The Comer Test for Association. John W. Tukey, Princeton TTniversity, 
and Paul S. (Jlmstead, Bell Telephone Laboratories 

Construction In a scatter diagram, draw the two medians, that is, the median of the 
X values without regard to the values of y, and the median of the y values without regard 
to the values of x Think of the four quadrants thus formed as being labelled , -f-, — 
in Older, so that the two positive quadrants lie along one diagonal and the two negative 
along the othei. Boginmng at the right-hand side of the diagiam, count in along the ob- 
servations until forced to cross the honzontal median Write down the number of ob- 
servations met before this crossing, attaching the sign, -f, if they lay in the -|- quadrant, 
and the sign, — , if they lay in the — quadrant Repeat this process, moving up from 
below, moving to the right from the left, and moving down from above The quantity to 
be used in the test is the algebraic sum of the four numbeis thus wntten down 

Distribution. The exact distribution of this quantity when no association is present
and no two x's and no two y's are alike is almost independent of sample size over the range
of values where it is apt to be used. For example, a sum of 9 or more is expected less than
one time in ten for all samples of size 0 or more, a sum of 16 or more, less than one time in
100 for all samples of size 10 or more, and a sum of 21 or more, less than one time in 1000
for all samples of size 14 or more. Even for infinite sample size, the sums for these fractions
become only 9, 14, and 19, respectively.

Extensions. The same ideas that underlie the outside corner test for two variables
may be extended in several ways to give tests for various types of association among three
or more variables.

6. Consistent Estimates Based on Partially Consistent Observations, with
Particular Reference to Structural Relations. J. Neyman and Elizabeth
L. Scott, University of California.
Let {X_n} be a sequence of independent random variables and let F_n denote the
distribution of X_n. Each distribution F_n is assumed to depend on unknown parameters. If a
parameter θ appears in an infinity of distributions F_n, it is called structural. Otherwise
it is incidental. The sequence {X_n} is called consistent if {F_n} has no incidental parameters.
{X_n} is called partially consistent if {F_n} has both structural and incidental parameters.
The problem of fitting a straight line when both variables are subject to errors is that of a
partially consistent series of observations. Let ξ and η = α + βξ be two linearly connected
quantities, perhaps related to particular stars, where α and β are unknown. The values
ξ_i and η_i corresponding to the ith star, (i = 1, 2, ..., s), are unknown. The observations
provide measurements x_ij of ξ_i, (j = 1, 2, ..., m_i), and measurements y_ik, (k =
1, 2, ..., n_i), of η_i. Both m_i and n_i are bounded and small. On the other hand, s may be
considered as increasing without limit. Assume that the x_ij and the y_ik are normally
distributed with variances σ_1² and σ_2² and means ξ_i and η_i, respectively. Then the totality
of observations will form a partially consistent system with the structural parameters α, β,
σ_1 and σ_2 and with the ξ_i as incidental parameters. If the observable random variables are only
partially consistent, then the maximum likelihood estimates of the structural parameters
(a) need not be consistent, (b) even if they are consistent and asymptotically normal,
alternative estimates may exist which have the same properties but smaller asymptotic
variances. Consistent estimates of structural parameters may be obtained from "modified"
equations of maximum likelihood. The lower bound of the variance of estimates of
structural parameters, provided by the Cramér-Rao inequality, is attained only on certain
conditions which are both necessary and sufficient.
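
Claim (a) can be illustrated numerically with the standard textbook simplification of this setting, which is an assumption of the sketch and not taken from the abstract: each "star" contributes just two measurements of a single quantity, with its own unknown mean (incidental) and a common variance σ² (structural). The maximum likelihood estimate of σ² then settles near σ²/2 however many stars are observed. The function names and the chosen numbers are hypothetical.

```python
import random


def variance_mle(pairs):
    """ML estimate of the common variance when every pair has its own mean."""
    total = 0.0
    for a, b in pairs:
        m = (a + b) / 2.0  # ML estimate of the incidental mean of this pair
        total += (a - m) ** 2 + (b - m) ** 2
    return total / (2 * len(pairs))


def simulate_pairs(s, sigma, seed=0):
    """Draw s pairs; each pair shares one arbitrary mean and the common sigma."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(s):
        mu = rng.uniform(-10.0, 10.0)  # incidental parameter, one per pair
        pairs.append((rng.gauss(mu, sigma), rng.gauss(mu, sigma)))
    return pairs


pairs = simulate_pairs(s=20000, sigma=2.0)   # true variance is 4
mle = variance_mle(pairs)                    # settles near 2, not 4
corrected = 2.0 * mle                        # a simple consistent alternative
```

Doubling the estimate removes the bias in this toy case; it stands in for, and is much simpler than, the modified likelihood equations mentioned in the abstract.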



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 

Dr. Paul H. Anderson has been appointed Economic Analyst with the Marketing
Division, Office of Domestic Commerce, Department of Commerce, Washington.

Dr. Gilbert W. Beebe is now with the Division of Medical Sciences, National
Research Council, Washington.

Professor Harald Cramér, Director of the Institute of Mathematical Statistics
of the University of Stockholm, was awarded the degree of Doctor of Science,
honoris causa, by Princeton University on February 22, 1947. Professor Cramér
has acted as Visiting Professor of Mathematics at Princeton University and
Yale University during the academic year 1946-47. He will be at the University
of California at Berkeley during the 1947 Summer Session.

Dr. Paul M. Densen has accepted a position with the Division of Medical
Research Statistics, Bureau of Medicine and Surgery, Veterans Administration,
Washington.

Mr. M. V. Divatia is now in charge of the office of the Statistician and Eco- 
nomic Adviser and Under-Secretary to the Government of Sind, Karachi, 
India. 

Mr. Clarence B. Fine, formerly with the Office of Price Administration, has
transferred to the Bureau of Old-Age and Survivors Insurance, Social Security
Administration, where he is employed as a Sampling Expert.

Prof. Charles C. Grove was appointed Visiting Lecturer in Mathematics at
the University of Pennsylvania for the spring semester.

Assoc. Prof. E. E. Haskins of Northeastern University has been appointed to
an assistant professorship at the Army Air Forces Institute of Technology,
Wright Field, Dayton, Ohio.

Prof. Roger Lessard of the Hull Technical School has accepted a position at
the Ecole Polytechnique, Montreal.

Mr. Edward D. Lowery is now a member of the Research Department, Win- 
chester Arms Company, New Haven, Connecticut. 

Professor H. B. Mann of Ohio State University has been awarded the Frank 
Nelson Cole prize in the Theory of Numbers for 1946. 

Dr. Margaret P. Martin has been appointed to an assistant professorship in 
the Department of Preventive Medicine and Public Health, Vanderbilt Uni- 
versity Medical School, Nashville, Tennessee. 

Dr. A. L. O’Toole is at present employed by the Veterans Administration in
the Washington headquarters, as Acting Chief of the Administrative Analysis
Division in the Research Service. Dr. O’Toole was released from the Navy on
September 23, 1946, to inactive duty in the U. S. Naval Reserve, with the rank
of Commander. Dr. O’Toole served for nearly four years in the Navy, in
important administrative and statistical work for the Commander South Pacific
Area and South Pacific Force. He will be remembered as having been with
Admiral Halsey’s Pacific Fleet, and was awarded the Bronze Star Medal. At
the time of his release, he was Chief Staff Officer for Commander South Pacific
Area and South Pacific Force.

Mr. I. B. Perrott, since his demobilization from the British Army, has been
Lecturer in Mathematics at the College of Technology and Commerce, Leicester,
England.

Mr. J. S. Ripandelli is now with the Actuarial Department of the Jefferson
Standard Life Insurance Company of Greensboro, North Carolina.

Dr. Ronald W. Shephard of the University of California has been appointed
to the staff of the Department of Mathematics, New York University.

Mr. John R. Stehn is now a member of the Research Laboratory of the General
Electric Company, Schenectady, New York.

Dr. Charles W. Vickery, formerly of Ohio State University, is engaged in work
as a Research Consultant in New York City.


Miss Margaret Jeaiinm Dix, of the University of California Statistical Laboratory,
died an accidental death at her home in Berkeley on June 20, 1946.

Mr. Albert M. Freeman, of the Boston Fiduciary and Research Association,
died May 20, 1946.

Dr. Walter Schilling, of the Stanford University Hospital, died suddenly in
San Francisco, December 16, 1946.


Summer Statistical Session at the University of California at Berkeley 

The important advances in the theory of statistics during the war and especially
the unprecedented growth in the fields of application have created a
strong demand for trained statisticians to fill both the research and the teaching
positions all over the country. Since in many cases the war time education had
to be somewhat sketchy, unsystematic, and not very conducive to a thorough
coverage of the vast material, it is felt that a relatively brief set of courses on a
rather advanced level would be beneficial to many persons, both those who already
hold research or teaching positions in statistics, as well as those who
prepare for higher degrees.

With this object in mind, the University of California at Berkeley is offering
a set of statistical courses during the Summer Session, June 23rd to August 2nd,
1947. There will be three courses: (i) General Theory of Random Variables and
Frequency Distributions, by Harald Cramér of the University of Stockholm,
(ii) Problems of Testing Hypotheses and of Estimation, by J. Neyman, University
of California, Berkeley, and (iii) Seminar Course. The last will be given by
seven scholars, each giving two hours of lectures, as follows:


1. Statistical Astronomy                                          J. Trumpler
2. Orthogonal Polynomials and Problems of Moments                 Szegö
3. Methods of Calculation                                         F. Lenzen
   (a) Gibbs’ Methods in Statistical Mechanics
   (b) Darwin-Fowler Method of Statistics
4. Large Scale Sampling Surveys                                   P. C. Mahalanobis
5. Statistical Problems Arising in Nuclear Physics Measurements   R. Seebbr
6. Problems of Population Genetics                                S. Emerson
7. Interactions between Industrial Problems and Mathematical
   Statistics                                                     H. Scheffé

The purpose of the Seminar Course is to introduce the students either to
branches of pure mathematics contingent on mathematical statistics but not
ordinarily taught in the universities or to various fields of knowledge offering
fruitful fields for statistical studies.


Summer Statistical Session at Virginia Polytechnic Institute 

A Summer Statistical Session will be held at Virginia Polytechnic Institute,
Blacksburg, Virginia, August 5 to September 5, 1947. This Session will be
sponsored jointly by Virginia Polytechnic Institute, University of North Carolina,
University of Michigan, Iowa State College, and the Federal Bureau of
Agricultural Economics.

The faculty will consist of: Walter A. Hendricks, B.A.E., U.S.D.A.; Rensis
Likert, University of Michigan; H. L. Lucas, University of North Carolina;
Maurice G. Kendall, England; George W. Snedecor, Iowa State College; Frank
Yates, Rothamsted Experiment Station, England; Earl E. Houseman, B.A.E.,
U.S.D.A.; Raymond J. Jessen, Iowa State College; and Boyd Harshbarger,
Virginia Polytechnic Institute.

The following courses will be offered for credit: Engineering Statistics;
Statistical Methods; Design of Animal Experiments; Schedule Design and Interview
Techniques for Sample Surveys; Sampling Design and Analysis; Mathematical
Theory of Sampling; Seminar; Mathematical Statistics; and Experimental
Design.

In addition to the faculty, probable Seminar speakers are: W. F. Callendar,
W. G. Cochran, Miss Gertrude M. Cox, W. E. Deming, George Gallup, M. H.
Hansen, Harold Hotelling, Arnold King, and Charles F. Sarle.

Inquiries regarding the Summer Session should be addressed to Boyd Harshbarger,
Professor of Statistics, Summer Statistical Session, Virginia Polytechnic
Institute, Blacksburg, Virginia.
New Members

The following persons have been elected to membership in the Institute
(January 1 to February 28, 1947)

Asofsky, Samuel, B.S. (C.C.N.Y.) Stat., National Jewish Welfare Board, 19S8 B, IS St.,
Brooklyn SO, N. Y.

Auer, Richard M., A.M. (Columbia) Instr. in Math., State Teachers Coll., Montclair,
N. J., SS No. 16 St., East Orange

Bakan, David, M.A. (Indiana) Chief Stat., Comm. on Selection and Training of Aircraft
Pilots, National Research Council, SB9 Natatorium, Ohio State Univ., Columbus 10,
Ohio

Beatty, Glenn H., A.B. (Ohio State) Grad. student and Fellow, Iowa State College, Station
A, General Delivery, Ames, Iowa

Campbell, Wallace A., B.S. (Columbia) Stat. Analyst, War Assets Administration, 4SS
Washington Ave., Brooklyn 18, N. Y.

Celia, Francis R., M.A. (Kentucky) Assoc. Prof. of Statistics and Director, Bur. of Business
Research, Univ. of Oklahoma, Norman, Okla.

Chapman, Douglas G., M.A. (Toronto) Asst. Prof. of Math., Univ. of British Columbia,
Vancouver, Canada

Cheydleur, Benjamin F., B.A. (Wisconsin) Chief, Mechanized Analysis, Naval Ordnance
Lab., 60S Avenue E, District Heights, Washington 19, D. C.

Coombs, Clyde H., Ph.D. (Chicago) Ass’t Prof. of Psychology, and Research Psychologist,
Institute for Human Adjustment, Univ. of Michigan, Ann Arbor, Mich., 10S7 E.
Huron

Corton, Edward L., Jr., MBA. (Chicago) Grad, student, Iowa State Coll., SOS Hodge 
Ave., Ames, Iowa 

Davis, Harold., A B, (Brooklyn Coll ) Stat , Navy Dept., 41S — SS St,, S B., Washington, 
D. C. 

Dutton, Arthur M., B S.E E (Iowa State) Grad. Fellow, Mathematics Dept., Iowa State 
Col]., Ames, Iowa 

Fay, Edward A., AM (Harvaid) Giad. student, Univ. of California, Berkeley, ,^15 SouA 
17th St , Apt SB, Richmond, Calif. 

Flanagan, John C., Ph.D. (Harvard) Prof, of Psychology, Univ of Pittsburgh, Pitts- 
burgh 13, Pa 

Gardner, Eric F., Ed M (Boston Teachers) Teaching Fellow and Milton Fellow, Grad. 
School of Educ , Harvard Univ., Cambridge, Mass , Walker House, Jfi Quincy St. 

Gerende, Lincoln J., C Ph M., U S. Navy, Naval Medical Res Institute, National Naval 
Medical Center, Bethesda 14, Md. 

Grossman, Evelyn, M.A. (Columbia) Stat., U. S Dept, of Agriculture, 6401 — 14 St., 
N. W , Washington IS, D C. 

Hill, Edwin A., Jr., M A. (Columbia) Instr. in Math., Coll, of the City of N. Y., 50 West 
67 St., New York SS, N. Y 

Horton, H. Burke, M.B.A. (Texas) Senior Tianeport Analyst, S906 Naylor Rd., S. B,, 
Washington W, D C. 

Horvitz, Daniel G., B.S. (Mass. State) Grad. student, Iowa State Coll., ;8fS7' Country Club
Blvd., Ames, Iowa

Ikhtlar-ul-Mulk, S. M., M.A (Punjab, India) Grad, student, Princeton Univ., Oraduate 
College, Princeton, N. J. 

Jaeger, Carol M., B.A. (Dubuque) Statistician, JSOO Columbia Terrace, Peoria 5, Ill.

Jessen, Raymond J., Ph.D. (Iowa State) Res. Assoc. Prof , Iowa State College, and 
Agno Statistician, U S.D.A., Statistical Lab., Iowa State Coll , Ames, Iowa 

Kinzer, Mrs. Lydia Greene, M.A. (Kansas) Ass’t Instr. in Math., Ohio State Univ.,
585 East Town Street, Columbus 15, Ohio
Langenhop, Carl E., M.S. (Iowa State) Instr. in Math., Iowa State Coll., Apt. 3, Cranford
Annex, Ames, Iowa

Lowy, Melitta E., A.B. (Hunter) Statistician, Grad, student, Columbia Univ., 645 West 
End Ave , Hew York SB, N Y. 

Mattila, Sakari, Fil. Mag. (Helsinki) High School of Commerce, Helsinki, Finland

Mayerson, Allen L., B S (Michigan) Grad, student and Teaching Fellow, Univ. of Mich , 
ISOS Packard Si., Ann Arbor, Mich. 

McCreary, Garnet E., M.A. (Queen’s Univ.) Research Fellow, Statistical Lab , Iowa 
State Coll., Ames, Iowa 

McMillan, Olan T., M.A. (Michigan) Instr. in Math., Michigan State Coll., East Lansing,
Mich.

Morris, Edward B., A.B. (Indiana) Statistician, U. S. Bur. of Labor Statistics, Eidga 
Place S E., Washington 20, D C. 

Moshman, Jack, BA (New York) Tutor in Math., Queens Coll., Flushing, N. Y., 125-03 
Liberty Ave., Richmond Hill 19 

Natrella, Mrs. Mary G., B.A (Pennsylvania) Statistician, Bureau of Ships, Navy Dept., 
ISIO — ISth St , N W. Washington 5, D C. 

Neal, T. Ellison, A B (Geo Washington) Statistician, Textile Dev Dept., U. S. Rubber 
Co., Hogansville, Ga. 

Noble, Carl E., Ph D. (Iowa) Quality Methods Engineer, Kunberly Clark Corp , Lake- 
view Mill, Neenah, Wis. 

Ostle, Bernard, M A. (British Columbia) Teaching Ass’t, School of Bus. Adm., Univ. of 
Minnesota, Minneapolis, Minn. 

Oxtoby, Toby E., B.A. (Iowa) Grad Ass’t, Dept of Psychology, State Univ. of Iowa, 
Iowa City, Iowa 

Peisakoff, Melvin P., Student, Princeton Univ., 34 North West College, Princeton, N. J.

Rothschild, Colette, (Ecole Normale Superieure) Attaches de Recherches au Centre Na- 
tional de la Recherche Scientifique, 4S rue Madame, Pans VI‘, France 

Slonim, Morris J., M.B.A. (Harvard) Statistician, Bureau of Labor Statistics, 210 Wayne
Place S. E., Washington SO, D. C.

Soler, Reuben I., B.B A (C C.N.Y.) Statistician, Food and Drug Administration, S4B 
Portland St , S E , Washington, D. C. 

Stouffer, Samuel A., Ph.D. (Chicago) Prof, of Sociology and Director of the Laboratory 
of Social Relations, Emerson Hall, Harvard Umv., Cambridge, Mass. 

Teicher, Henry, B.A. (Iowa) Graduate student, Columbia Univ., 139 Osborne Terrace,
Newark, N. J.

Tiedeman, David V., M.A. (Rochester) Instr. in Educ., Grad. School of Educ., Harvard
Univ., Walker House, 40 Quincy St., Cambridge 38, Mass.

Tintner, Gerhard, Ph.D. (Vienna) Prof. of Economics and Mathematics, Iowa State
Coll., Ames, Iowa

Weiss, Eleanor S., Ed M. (Boston Teachers) Teaching Fellow, Grad School of Educ., 
Harvard Umv., S005 Commonwealth Ave , Brighton 35, Mass. 

Wilson, William A., Jr., A B (California) Teaching Ans’t in Psychology, Univ. of Calif., 
Berkeley 4, Calif 

Woodell, Allan D., A.B. (N. Y. State Teachers, Albany) Graduate student in math., Univ. 
of Mich., 4S5 Church Si., Ann Arbor, Mich. 


Omitted from 1946 lists of new members' 

Feraud, Prof. Lucien, Faculte des Sciences Eoonomiques et Sooiales, Univ. de Geneve, 
S4 rue Henri Mussard, Oenive, Switzerland 



REPORT ON THE ATLANTIC CITY MEETING OF THE INSTITUTE 

The Ninth Annual Meeting of the Institute of Mathematical Statistics was
held at Atlantic City, New Jersey, on Friday and Saturday, January 24-25, 1947.
The meeting was held in conjunction with meetings of the American Economic
Association, American Statistical Association, and the Econometric Society.
The following 154 members of the Institute attended the meeting.

Beatrice Aitchison, S. L. Alt, R. L. Anderson, T. W. Anderson, K. J. Arrow, Max Astrachan,
B. M. Bennett, Joseph Berkson, A. J. Berman, C. I. Bliss, Paul Boschan, A. E.
Brandt, M. F. Bresnahan, Philip Brown, O. P. Bruno, E. W. Burgess, O. K. Buros, B. H.
Camp, F. E. Celia, Uttam Chand, K. L. Chung, C. W. Churchman, P. C. Clifford, W. J.
Cobb, W. G. Cochran, F. G. Cornell, D. E. Cowan, Harald Cramér, J. H. Curtiss, J. F. Daly,
G. B. Dantzig, D. G. Delhi, D. B. DeLury, B. W. Dempsey, H. F. Dorn, F. W. Dresch,
A. J. Duncan, David Durand, P. S. Dwyer, Churchill Eisenhart, W. D. Evans, Will Feller,
C. D. Ferris, Irving Fisher, L. E. Frankel, M. A. Geisler, Leon Gilford, M. A. Girshick,
C. H. Graves, K. E. Greene, S. W. Greenhouse, F. E. Grubbs, E. J. Gumbel, Margaret
Gurney, Louis Guttman, Trygve Haavelmo, K. W. Halbert, M. H. Hansen, Miriam S.
Harold, T. E. Harris, Boyd Harshbarger, Bernard Hecht, Wassily Hoeffding, H. B. Horton,
Harold Hotelling, E. E. Houseman, Helen M. Humes, Leonid Hurwicz, Seymour Jablon,
E. W. James, R. J. Jessen, H. L. Jones, Alice S. Kaitz, H. B. Kaitz, L. S. Kellogg, H. S.
Konijn, Tjalling Koopmans, C. F. Kossack, K. L. Kozelka, D. H. Leavens, Howard Levene,
J. E. Lieberman, Rensis Likert, S. B. Littauer, Irving Lorge, P. J. McCarthy, P. W.
McGann, F. B. McIntyre, H. F. MacNeish, J. D. Maddrill, Jacob Marschak, Max Millikan,
A. M. Mood, Mrs. Margaret Moore, J. W. Morse, J. E. Morton, Frederick Mosteller, D. N.
Nanda, P. M. Neurath, Jerzy Neyman, M. L. Nordon, Nilan Norris, H. W. Norton, P. S.
Olmstead, E. G. Olds, Sophie Rakesky, Chester Rapkin, Olav Reiersol, W. A. Reynolds,
P. R. Rider, C. F. Roos, A. C. Rosander, Ernest Rubin, Herman Rubin, P. J. Rulon, Frank
Saidel, Marion M. Sandomire, Max Sasuly, F. E. Satterthwaite, E. D. Schell, E. M. Schrock,
D. H. Schwartz, G. R. Seth, L. W. Shaw, W. A. Shewhart, J. H. Smith, E. T. Smith, Leslie
E. Simon, Milton Sobel, C. M. Stein, G. T. Steinberg, Joseph Steinberg, H. W. Steinhaus,
F. F. Stephan, A. P. Stergion, M. S. Stevens, G. J. Stigler, S. A. Stouffer, Zenon Szatrowski,
B. J. Tepping, J. W. Tukey, D. F. Votaw, Jr., Helen M. Walker, J. H. Watkins, Louis
Weiner, Samuel Weiss, S. S. Wilks, Elizabeth W. Wilson, C. P. Winsor, J. Wolfowitz, M. A.
Woodbury, Holbrook Working, C. A. Wright, and T. O. Yntema.

The first session, a joint session with the Econometric Society and the Biometrics
Section of the American Statistical Association, was held at two o’clock
on Friday afternoon, and was devoted to the topic, Applications of Statistical
Techniques to Agricultural Economics. Holbrook Working of Stanford University
presided. The following four papers were presented:

1. Use of Variance Components in the Analysis of Market Differentials in Hog Prices.
R. L. Anderson, University of North Carolina.

2. An Application of the Analysis of Variance in the Economic Evaluation of Production.
Boyd Harshbarger, Virginia Polytechnic Institute.

3. A Model of the Economic Interdependence between Agriculture and the National Economy.
Trygve Haavelmo, Cowles Commission for Research in Economics.

4. The Reduced-Form Method for Estimating Simultaneous Economic Relationships.
M. A. Girshick, Bureau of the Census.
The session concluded with a discussion of these papers by T. W. Anderson,
Columbia University; Milton Friedman, University of Chicago; and Harold
Hotelling, University of North Carolina.

At 8 o’clock on Friday evening there was a joint session with the Econometric
Society and the American Statistical Association, on the topic, When is the
Analysis of Variance Useful in Economic Research? Arthur R. Tebbutt of
Northwestern University presided, and the following three papers were presented:

1. The Advantages of the Analysis of Variance for Research and Managerial Control
Purposes. Harry Pelle Hartkemeier, University of Missouri.

2. Estimation of Economic Relationships and Multivariate Regression.
Leonid Hurwicz, Iowa State College.

3. Nonstandard Forms of Variance Analysis.
W. Allen Wallis, University of Chicago.

There was discussion of these papers by Tjalling Koopmans, Cowles Commission
for Research in Economics; Gerhard Tintner, Iowa State College; and J. W.
Tukey, Princeton University.

At 10 o’clock on Saturday morning there was a joint session with the American
Statistical Association devoted to the topic, Use of Ordered Observations in
Statistical Analysis, with Harold Hotelling of the University of North Carolina
as chairman. The following two papers were presented:

1. Estimation of Parameters by Use of Order Statistics.
Frederick Mosteller, Harvard University.

2. Tolerance Limits.
Jacob Wolfowitz, Columbia University.

There was discussion of these papers by John H. Smith, Bureau of Labor Statistics;
Howard L. Jones, Illinois Bell Telephone Company; and J. W. Tukey,
Princeton University.

At the Saturday morning session one contributed paper of the Institute of
Mathematical Statistics was also presented, by E. J. Gumbel, Newark College
of Engineering, on the topic The Asymptotic Distribution of the Range.

The Institute’s session at 2 o’clock Saturday afternoon was devoted to contributed
papers. W. G. Cochran, president of the Institute, presided, and the
following four papers were presented:

1. A Test of Significance of the Coefficient of Rank Correlation for More than Thirty Ranked
Items.
Nilan Norris, Hunter College.

2. A Generalized T Measure of Multivariate Dispersion.
Harold Hotelling, University of North Carolina.

3. Asymptotic Properties of Maximum and Quasi-Maximum Likelihood Estimates.
Herman Rubin, Cowles Commission for Research in Economics.

4. The Corner Test for Association.
J. W. Tukey, Princeton University, and Paul Olmstead, Bell Telephone Laboratories.




Abstracts of these papers appear elsewhere in this issue. 

Following the session on contributed papers, Professor Jerzy Neyman of the
University of California gave an invited address on the topic: On Consistent
Estimates, with Particular Reference to Structural Relations between Several
Variables all Subject to Random Error. A discussion of this address followed, by
Miss E. L. Scott, University of California; A. Wald, Columbia University; and
Tjalling Koopmans, Cowles Commission for Research in Economics.

The meeting closed with the annual business meeting of the Institute, which
was held at 5 p.m. on Saturday in Haddon Hall. Reports by the President,
Secretary-Treasurer, and Editor were followed by the election of officers for
1947: Will Feller, President, Morris H. Hansen and John H. Curtiss,
Vice-Presidents, and Paul S. Dwyer, Secretary-Treasurer.

P. S. Dwyer, 
Secretary. 
Introduction. If n real variables $x_1, x_2, \cdots, x_n$ are subject to a probability
distribution with the element $dV_1(x_1)\,dV_2(x_2) \cdots dV_n(x_n)$ one can ask for the
distribution of any function f of $x_1, x_2, \cdots, x_n$. We are primarily interested in
statistical functions, i.e. in functions that depend on the repartition* $S_n(x)$ of the
n quantities $x_1, x_2, \cdots, x_n$ only. The simplest case is that of the linear
statistical functions

(1)    $f = \int \psi(x)\, dS_n(x) = \frac{1}{n}\,[\psi(x_1) + \psi(x_2) + \cdots + \psi(x_n)]$.
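
The repartition and a linear statistical function are easy to make concrete. The following Python sketch is an illustration only (the function names and the sample are hypothetical): it builds the repartition of a small sample as a step function, and checks that the Stieltjes integral of ψ with respect to it, taken as a sum over its jumps, is the plain average of the ψ(x_i).

```python
from bisect import bisect_right
from collections import Counter


def repartition(sample):
    """Return S_n: S_n(x) is the fraction of sample values <= x."""
    xs = sorted(sample)
    n = len(xs)

    def S(x):
        return bisect_right(xs, x) / n

    return S


def linear_statistic(psi, sample):
    """Average of psi over the sample, the right-hand side of (1)."""
    return sum(psi(x) for x in sample) / len(sample)


def stieltjes_integral(psi, sample):
    """Integral of psi dS_n as a sum over the jumps of the step function."""
    n = len(sample)
    jumps = Counter(sample)  # S_n jumps by count/n at each distinct value
    return sum(psi(v) * c / n for v, c in jumps.items())


def square(x):
    return x * x


sample = [2.0, 0.5, 3.5, 2.0]
S = repartition(sample)
steps = [S(0.0), S(0.5), S(2.0), S(10.0)]
mean_value = linear_statistic(lambda x: x, sample)
second_moment = stieltjes_integral(square, sample)
```

Because the repeated value 2.0 produces a jump of 2/4, the step function takes the values 0, 0.25, 0.75 and 1 at the four test points.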


* The function $S_n(x)$ is called the repartition of the real quantities $x_1, x_2, \cdots, x_n$ if
$nS_n(x)$ is the number of those among the $x_1, x_2, \cdots, x_n$ that are smaller than or equal to x.

The so-called Central Limit Theorem of Probability Calculus states that the
distribution of a linear statistical function, if n tends to infinity, approaches
more and more the normal (Gauss) distribution if some very general conditions
linking $\psi(x)$ and the $V_\nu(x)$ are fulfilled. It has been shown, ten years ago, [2]
that the restriction to linear functions here is immaterial. Much more general
statistical functions tend towards normalcy with increasing n, for example the
variance of mth order

(2)    $f = M_m = \int (x - a)^m\, dS_n(x), \qquad a = \int x\, dS_n(x)$

and, likewise, such combinations as the Lexis quotient $M_2/a(1 - a/N)$ or Gini’s
disparity measure $1 - \int (1 - S_n)^2\, dx / a$ or, in the multidimensional case, the
correlation coefficient, etc. On the other hand, statistical functions are known
whose distributions assume, asymptotically, a form different from the Gaussian.
One example is Pearson’s Chi-square, another the test function introduced
by H. Cramér [1] and the author [4]:

(3)    $f = \omega^2 = \int g'(x)\,[S_n(x) - \bar{V}_n(x)]^2\, dx$

where $g'(x) > 0$ and

(4)    $\bar{V}_n(x) = \frac{1}{n}\,[V_1(x) + V_2(x) + \cdots + V_n(x)]$.

N. V. Smirnoff [7, 8] computed the asymptotic distribution of $\omega^2$ for the case
that all $V_\nu(x)$ and, therefore, $\bar{V}_n(x)$ equal one and the same distribution
function $V(x)$. The result differs widely from the Gaussian distribution.
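
For a concrete sample the test function is simple to evaluate. The Python sketch below is an illustration under assumptions that are not in the text: it takes the special case in which all the distributions equal one continuous V(x) and the weight is g'(x) = V'(x), so that the integral becomes ∫[S_n − V]² dV, and it uses the well-known computing formula ω² = 1/(12n²) + (1/n) Σ [V(x_(i)) − (2i − 1)/(2n)]² over the ordered sample. The function names are hypothetical; the second function is only a numerical cross-check for a uniform V on (0, 1).

```python
from bisect import bisect_right


def omega_squared(sample, cdf):
    """omega^2 = integral of [S_n - V]^2 dV via the standard computing formula."""
    n = len(sample)
    u = sorted(cdf(x) for x in sample)
    # With 0-based i the plotting position (2i - 1)/(2n) becomes (2i + 1)/(2n).
    s = sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))
    return 1.0 / (12 * n * n) + s / n


def omega_squared_numeric(sample, steps=20000):
    """Midpoint-rule check for V uniform on (0, 1): integral of [S_n(t) - t]^2 dt."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) / steps
        s_n = bisect_right(xs, t) / n  # repartition of the sample at t
        total += (s_n - t) ** 2
    return total / steps


sample = [0.1, 0.4, 0.7]
exact = omega_squared(sample, lambda x: x)   # uniform V(x) = x on (0, 1)
approx = omega_squared_numeric(sample)
```

For this three-point sample the formula gives 0.02, and the direct numerical integration agrees to within the discretization error.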
In order to understand all this it is necessary to consider f as a function
defined in the space of distributions $V(x)$ (or in a sub-space of it). Then, the
variable f whose distribution is sought is the value of $f\{V(x)\}$ at the “point” $S_n(x)$
and should be written as $f\{S_n(x)\}$. Such “functions of functions” were first
introduced by Vito Volterra (1887) and are today a familiar topic of higher
analysis. The first statement that can be made is that the asymptotic distribution
of $f\{S_n(x)\}$ depends mainly on the behavior of $f\{V(x)\}$ at the point
$\bar{V}_n(x)$ defined by (4).

Volterra also introduced the notion of derivatives and of Taylor development
for a “fonction de ligne.” Using these concepts a more specific statement can
be pronounced: The type of asymptotic distribution of a differentiable statistical
function $f\{S_n(x)\}$ depends on which is the first non-vanishing term in the Taylor
development of $f\{V(x)\}$ at the point $\bar{V}_n(x)$: if it is the linear term the limiting
distribution is normal, under restrictions that can easily be derived from the Central
Limit Theorem; in other cases higher types of asymptotic distributions result.

The present paper tries to establish this theorem and to furnish preliminary
information about the asymptotic distribution of the second type.

If both the function $f\{V(x)\}$ and the sequence of distributions $V_1(x)$, $V_2(x)$,
$V_3(x), \cdots$ are defined independently of each other, it cannot be presumed that
the derivative of f vanishes at $\bar{V}_n(x)$. In this sense the normal distribution
appears as the “general case” of an asymptotic distribution while the higher types
represent certain “singularities.” In the case of type m, (m = 1, 2, 3, ...),
the distribution of the expression

(5)    $n^{m/2}\,[f\{S_n(x)\} - f\{\bar{V}_n(x)\}]$

tends towards a function of bounded mean value and variance. For m = 1
it is a Gauss function with mean value 0 and finite variance. For any uneven
m the distribution is symmetrical with respect to the zero point. If f is given,
the limiting distribution is essentially determined if in addition to $\bar{V}_n(x)$ one
function of two variables, $U_n(x, y)$, is known,

(6)    $U_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} [V_\nu(x) - V_\nu(x)V_\nu(y)], \quad (x \le y)$

       $U_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} [V_\nu(y) - V_\nu(x)V_\nu(y)], \quad (x \ge y)$.

For instance, in the case of the linear function (m = 1) defined in eq. (1), the
(second order) variance of (5) is found as the Stieltjes integral

(7)    $\iint \psi(x)\,\psi(y)\, dU_n(x, y)$

and no mean values of higher order are required for computing the moments of
any order, whatever m is.

For m = 2 the complete expression for the characteristic function of the
asymptotic distribution of (5) is developed in Part III of this paper. It has the form

(8)    $1 / D(ui)$

where $D(\lambda)$ is in general the Fredholm determinant of a symmetrical kernel that
depends on the second derivative of $f\{V(x)\}$ at $V = \bar{V}_n$, on $\bar{V}_n$ and on $U_n$.
If the $V_\nu(x)$ are discontinuous distributions with saltus at k distinct points only,
D is the determinant of a quadratic form of k variables. This happens to be
the case with Pearson’s $\chi^2$ while the distribution found by Smirnoff represents
a fairly general case of the asymptotic distribution of second type.


PART I. PRELIMINARY THEOREMS 

1. Asymptotically equal distributions. Let $K_1, K_2, K_3, \cdots$ be an infinite
sequence of collectives, $k_n$ the number of variables in $K_n$ and $A_n$, $B_n$ two
functions of these variables, (n = 1, 2, 3, ...). The cumulative distribution
functions of $A_n$ and $B_n$ will be denoted by $P_n(x)$ and $Q_n(x)$ respectively, i.e.

(1)    $P_n(x) = \mathrm{Prob}\{A_n \le x\}, \qquad Q_n(x) = \mathrm{Prob}\{B_n \le x\}$

and the expectation of $|A_n - B_n|$ by

(2)    $E_n\{|A_n - B_n|\}$

all these quantities being taken with respect to the distribution in $K_n$.
Two functions $F_n(x)$ and $G_n(x)$ both depending on the parameter n are said
to be asymptotically equal if

(3)    $\lim_{n = \infty} |F_n(x) - G_n(x)| = 0$ uniformly in x.

If this is the case for the cumulative distribution functions $P_n(x)$ and $Q_n(x)$ of
$A_n$ and $B_n$ we shall also say that $A_n$ and $B_n$ have the same asymptotic
distribution. Eq. (3) will also be written as $F_n(x) \sim G_n(x)$. The following can be
proved:

Lemma A. If with increasing n the expectation of the absolute difference
between $A_n$ and $B_n$ tends towards zero and if one of the functions $P_n(x)$ or $Q_n(x)$ is
asymptotically equal to a function $F_n(x)$ that has a uniformly bounded derivative,
i.e.

(4)    $\lim_{n = \infty} E_n\{|A_n - B_n|\} = 0, \qquad \left|\dfrac{dF_n}{dx}\right| < M$ for all n

then $A_n$ and $B_n$ have the same asymptotic distribution.

This statement, in a slightly different wording, was proved in an earlier paper
[2] and the proof will not be repeated here. If one of the various definitions for
“stochastical convergence” is used, one can also say that $A_n$ and $B_n$, under the
stated conditions, converge stochastically towards each other.

The Lemma A can be extended and modified in various ways. First, it is
obvious that the expectation of $|A_n - B_n|$ can be replaced by that of any
positive power of $|A_n - B_n|$. With respect to $F_n$ one could ask for the existence
of a bounded derivative in all points except for a zero set only. Then $P_n$ and
$Q_n$ would still converge everywhere except for this zero set and the definition
of asymptotically equal distributions could be extended to this case. In the
present paper this will not be done as it is not our purpose to strive for results
of the possibly greatest generality.

2. Special class of statistical functions: quantics. Preliminary to the study
of general statistical functions a special class which corresponds to quantics
(homogeneous polynomials) of mth order must be discussed. Let $V_1(x)$, $V_2(x)$,
$V_3(x), \cdots$ be the cumulative distribution functions in a sequence of one-dimensional
collectives $C_1, C_2, C_3, \cdots$ and $S_n(x)$ the repartition of a sample drawn
from the n-dimensional collective $K_n$, with the distribution element

$dV_1(x_1)\,dV_2(x_2) \cdots dV_n(x_n)$.

We introduce

(5)    $T_n(x) = S_n(x) - \bar{V}_n(x), \qquad \bar{V}_n(x) = \frac{1}{n} \sum_{\nu=1}^{n} V_\nu(x)$.

Here, $nT_n(x)$ is obviously the excess of observed values $\le x$ over their expected
number. Quantics of first, second, third, ... order are then defined as

(6)    $f_1\{S_n(x)\} = \int \psi(x)\, dT_n(x)$

       $f_2\{S_n(x)\} = \iint \psi(x, y)\, dT_n(x)\, dT_n(y)$

       $f_3\{S_n(x)\} = \iiint \psi(x, y, z)\, dT_n(x)\, dT_n(y)\, dT_n(z)$

all integrals to be extended over the total range of x. Of course, only such $\psi$
for which the respective integral exists are admitted. The first, $f_1$, is obviously
a linear statistical function and the asymptotic distribution of $\sqrt{n}\, f_1$ is, under
well-known conditions, a Gauss function with the mean value zero and the
variance given in eq. (7) of the Introduction. In $f_2, f_3, \cdots$ the $\psi$ may be
supposed to be symmetrical with respect to their variables. It will be seen
later (Part II, sec. 2) that the first derivative of $f_2$, the first and second
derivatives of $f_3$, etc. vanish at the point $\bar{V}_n(x)$.

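All the above functions $f_1, f_2, f_3, \cdots$ can be considered (if the $\psi$ are
continuous) as the limits of ordinary quantics in $k$ variables. Choose $k$ disjoint
intervals $I_1, I_2, \cdots, I_k$ on the $x$-axis, and call their complement $I_{k+1}$.
Denote the increment of $V_\nu(x)$ within $I_\kappa$ by $p_{\nu\kappa}$ and the increment of
$S_n(x)$ by $p_{n\kappa}$. Obviously $p_{\nu\kappa}$ is the probability, within $C_\nu$, of $x$
falling in the interval $I_\kappa$ and $n p_{n\kappa}$ is the number of observed sample values
in the same interval. We introduce the excess values:

(7)   $\xi_\kappa = p_{n\kappa} - \bar p_\kappa, \qquad \bar p_\kappa = \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\kappa}$

and form the sums

(8)   $f_1 = \sum_{\kappa=1}^{k} \psi_\kappa\, \xi_\kappa, \quad f_2 = \sum_{\kappa,\lambda=1}^{k} \psi_{\kappa\lambda}\, \xi_\kappa \xi_\lambda, \quad f_3 = \sum_{\kappa,\lambda,\mu=1}^{k} \psi_{\kappa\lambda\mu}\, \xi_\kappa \xi_\lambda \xi_\mu, \cdots$.

By selecting suitable sets of intervals $I_1, I_2, \cdots, I_k$ and appropriate values
for the constants $\psi_\kappa, \psi_{\kappa\lambda}, \cdots$, one can approximate the integrals (6) by sums
of the form (8).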

Our next task will be to find asymptotic values for the expectation and for the
moments of the quantities defined in (8). Clearly a formula for the expectation
of a power product $\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots$, where $\alpha, \beta, \gamma, \cdots$ are positive
integers, is the only thing we need. To arrive at such a formula we replace each of
the one-dimensional collectives $C_\nu$ by a $k$-dimensional $C^*_\nu$ in the following way.

In $C^*_\nu$ the chance variable is a $k$-dimensional vector which can take $(k+1)$
distinct values only: it can be zero or coincide with the unit vector parallel to
one of the $k$ axes. To the latter values of the variable we assign the probabilities
$p_{\nu 1}, p_{\nu 2}, \cdots, p_{\nu k}$ and to the zero the probability

(9)   $p_{\nu,k+1} = 1 - p_{\nu 1} - p_{\nu 2} - \cdots - p_{\nu k}$.

This quantity, of course, may vanish. The mean value of $C^*_\nu$ is the point with
the coordinates $p_{\nu 1}, p_{\nu 2}, \cdots, p_{\nu k}$.

If the $n$ collectives $C^*_1, C^*_2, \cdots, C^*_n$ are combined, the sum of the $n$ observed
vector values is a vector with the components $n p_{n1}, n p_{n2}, \cdots, n p_{nk}$. If in
each $C^*_\nu$ the origin is shifted to the mean value and the coordinates with respect
to the new origin are called $z_1, z_2, \cdots, z_k$, the sums of the observed
$z_1, z_2, \cdots, z_k$-values will be $n\xi_1, n\xi_2, \cdots, n\xi_k$ rather than
$n p_{n1}, n p_{n2}, \cdots, n p_{nk}$. Thus it is seen that all questions concerning the
distributions of $\xi_1, \xi_2, \xi_3, \cdots$ can be answered on the basis of the well-known
rules on the addition of $n$ independent chance variables. This leads to the symbolic
formula for the expectation:

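(10)   $E_n\{(n\xi_1)^\alpha (n\xi_2)^\beta (n\xi_3)^\gamma \cdots\} = \Bigl(\sum_{\nu=1}^{n} z_{\nu 1}\Bigr)^{\alpha} \Bigl(\sum_{\nu=1}^{n} z_{\nu 2}\Bigr)^{\beta} \Bigl(\sum_{\nu=1}^{n} z_{\nu 3}\Bigr)^{\gamma} \cdots$

where on the right-hand side each term

(11)   $z_{\nu\iota}\, z_{\nu\kappa}\, z_{\nu\lambda} \cdots$

has to be replaced by

(11')   $\int z_\iota z_\kappa z_\lambda \cdots\, dV^*_\nu(z)$.

Here, obviously, $V^*_\nu(z)$ is the distribution function in $C^*_\nu$ and the expressions
(11') are in fact sums of $(k+1)$ terms, for example

(12)   $\int z_1 z_2\, dV^*_\nu(z) = p_{\nu 1}(1 - p_{\nu 1})(-p_{\nu 2}) + p_{\nu 2}(-p_{\nu 1})(1 - p_{\nu 2}) + \sum_{\iota=3}^{k+1} p_{\nu\iota}(-p_{\nu 1})(-p_{\nu 2}) = -p_{\nu 1}\, p_{\nu 2}$.

It will be seen in the next section that only very few of these sums are needed
for computing the asymptotic value of (10). Note that the value of (11') can
be expressed in terms of $p_{\nu 1}, p_{\nu 2}, p_{\nu 3}, \cdots$ alone if $z_1, z_2, z_3, \cdots$ only
appear in the product.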
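As a numerical illustration of the example (12), the following Python sketch enumerates the $(k+1)$ values of the auxiliary collective exactly. The function name `mixed_moment` and the sample probabilities are assumptions made only for this illustration; they do not appear in the paper.

```python
from fractions import Fraction

def mixed_moment(p, i, j):
    """Exact integral of z_i * z_j over the (k+1)-point collective C*.

    p holds the k interval probabilities; the zero vector carries the rest.
    Coordinates are measured from the mean, so z_c = (unit vector or 0) - p.
    """
    k = len(p)
    rest = 1 - sum(p)                       # probability of the zero vector, eq. (9)
    total = Fraction(0)
    for c in range(k + 1):                  # c == k stands for the zero vector
        weight = p[c] if c < k else rest
        zi = (1 if c == i else 0) - p[i]
        zj = (1 if c == j else 0) - p[j]
        total += weight * zi * zj
    return total

# Hypothetical probabilities for k = 3 intervals (1/4 is left for the complement).
p = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 4)]
off_diag = mixed_moment(p, 0, 1)            # should equal -p_1 * p_2
diag = mixed_moment(p, 0, 0)                # should equal p_1 * (1 - p_1)
```

Exact rational arithmetic is used so that the identity can be compared without rounding tolerance.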


3. Asymptotic expectation of excess-power products. We first consider the
case where the sum of exponents $\alpha, \beta, \gamma, \cdots$ is an even number

(13)   $\alpha + \beta + \gamma + \cdots = 2m$.

On the right-hand side of (10) stands a sum of terms, each a product of $2m$
factors $z_{\nu\kappa}$. It follows from (11') that the absolute value of a product cannot
surpass 1. The second subscripts are the same in each term: first $\alpha$ ones, then
$\beta$ twos, $\gamma$ threes, etc. The first subscripts are in each term a combination of
$2m$ digits out of $\nu = 1, 2, 3, \cdots, n$. The number of those combinations which
include $s$ different $\nu$-values, $(s = 1, 2, \cdots, 2m)$, is

(14)   $\binom{n}{s} \Bigl[ s^{2m} - \binom{s}{1}(s-1)^{2m} + \binom{s}{2}(s-2)^{2m} - \cdots \pm \binom{s}{s-1} \Bigr]$.

Obviously, the factors in brackets are bounded (independent of $n$).
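The count in (14) can be checked by brute force for small cases. The sketch below assumes that (14) is the usual inclusion-exclusion count of index sequences of length $2m$ that use exactly $s$ of the $n$ available values; the helper names are illustrative only.

```python
from itertools import product
from math import comb

def count_exact(n, length, s):
    """Inclusion-exclusion count of sequences using exactly s of n values."""
    onto = sum((-1) ** j * comb(s, j) * (s - j) ** length for j in range(s + 1))
    return comb(n, s) * onto

def count_brute(n, length, s):
    """Direct enumeration of all n**length sequences (small n only)."""
    return sum(1 for seq in product(range(n), repeat=length) if len(set(seq)) == s)

n, length = 5, 4          # length plays the role of 2m, here m = 2
counts = [count_exact(n, length, s) for s in range(1, length + 1)]
```

The bracketed factor depends on $s$ and $m$ only, so the dependence on $n$ is carried entirely by the binomial coefficient.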

If $s > m$ the combination of first subscripts must include at least one $\nu$-value
that appears only once. All those products vanish since

(15)   $\int z_\kappa\, dV^*_\nu(z) = 0$ for all $\kappa, \nu$

due to the fact that the origin in the $z$-space coincides with the mean value of
the distribution $V^*_\nu(z)$. Note that

(16)   $\lim_{n\to\infty} n^{-m} \binom{n}{s} = 0 \quad (s < m), \qquad = \frac{1}{m!} \quad (s = m)$.

It follows that the sum of all terms in (10) that correspond to any $s < m$ is
of the order $O(n^{m-1})$ or smaller.

Thus, we arrive at an asymptotic expression for $E_n$ by dividing both sides of
(10) by $n^m$:

(17)   $n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \frac{1}{n^m} \sum \prod_{\nu} \int z_\iota z_\kappa\, dV^*_\nu(z)$

where only such products on the right-hand side are retained which include
exactly $m$ different $\nu$-values each appearing twice.

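In analogy to (12) we compute

(18)   $\int z_\iota z_\kappa\, dV^*_\nu(z) = -p_{\nu\iota}\, p_{\nu\kappa} \quad (\iota \ne \kappa), \qquad = p_{\nu\iota}(1 - p_{\nu\iota}) \quad (\iota = \kappa)$

and write, for the sake of abbreviation

(19)   $\delta_{\iota\kappa}\, p_{\nu\iota} - p_{\nu\iota}\, p_{\nu\kappa} = P^{(\nu)}_{\iota\kappa}$

with the usual meaning of $\delta_{\iota\kappa}$ ($= 0$ if $\iota \ne \kappa$ and $= 1$ if $\iota = \kappa$). Then the sum
to the right in (17) includes $(2m)!/2^m$ terms, each a product of $m$ factors $P^{(\nu)}_{\iota\kappa}$.
If each of the $m$ couples $\iota, \kappa$ consists of two different figures, the respective
product appears $\alpha!\, \beta!\, \gamma! \cdots$ times; if $r$ couples are doubles ($\iota = \kappa$) the
multiplicity of the term is $2^{-r} \alpha!\, \beta!\, \gamma! \cdots$. Therefore, (17) takes the form

(20)   $n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \frac{\alpha!\, \beta!\, \gamma! \cdots}{n^m} \sum \sum 2^{-r}\, P^{(\nu_1)}_{\iota_1\kappa_1} P^{(\nu_2)}_{\iota_2\kappa_2} \cdots P^{(\nu_m)}_{\iota_m\kappa_m}$.

In this sum the upper indices are any set of $m$ digits out of $1, 2, 3, \cdots, n$
and the subscripts are all sets of $m$ couples including $\alpha$ ones, $\beta$ twos, $\gamma$ threes
etc. To each such set of $m$ couples belong $\binom{n}{m}$ terms of the sum. The number
of sets of couples is bounded (independent of $n$). The exponent $r$ is the number
of doubles ($\iota = \kappa$) among the $m$ pairs.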

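The expression (20) admits of a transformation which renders it much more
suitable. Assume that a set of couples $\iota, \kappa$ has been chosen according to the
conditions and consider the product

(21)   $\Bigl(\sum_{\nu=1}^{n} P^{(\nu)}_{\iota_1\kappa_1}\Bigr) \Bigl(\sum_{\nu=1}^{n} P^{(\nu)}_{\iota_2\kappa_2}\Bigr) \cdots \Bigl(\sum_{\nu=1}^{n} P^{(\nu)}_{\iota_m\kappa_m}\Bigr)$.

Among the $n^m$ terms which we obtain by developing (21) are all terms appearing
in the sum (20), each of them repeated $m!$ times and, in addition,

(22)   $n^m - \binom{n}{m} m! = n^m - n(n-1)(n-2) \cdots (n-m+1)$

other products of $m$ factors $P$. Since the difference (22) divided by $n^m$ goes to
zero with increasing $n$ and each $|P|$ is smaller than 1, the additional terms
have no importance. We therefore introduce the quantities

(23)   $P_{\iota\kappa} = \frac{1}{n} \sum_{\nu=1}^{n} P^{(\nu)}_{\iota\kappa} = \delta_{\iota\kappa}\, \bar p_\iota - \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\iota}\, p_{\nu\kappa}$.

Then (20) can be written as

(24)   $n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \frac{\alpha!\, \beta!\, \gamma! \cdots}{m!} \sum 2^{-r}\, P_{\iota_1\kappa_1} P_{\iota_2\kappa_2} \cdots P_{\iota_m\kappa_m}$.

Here we have a sum of a finite number of terms. It will be supposed in all that
follows that the $P_{\iota\kappa}$ as defined in (23) do not vanish identically as $n$ increases
indefinitely.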

Since in the sum (24) no upper indices appear, equal terms repeat themselves.
We can, therefore, rearrange it, using the polynomial coefficients and absorbing
at the same time the factor $2^{-r}$. The final form of (24) is given in the following
Lemma $B_1$, which also includes a statement for the case of an uneven sum of
exponents $\alpha + \beta + \gamma + \cdots$. In fact, it is easily seen that if again half the
sum is called $m$, no group of terms on the right-hand side of (10) exists that
would supply a finite limit when divided by $n^m$. Thus we arrive at

Lemma $B_1$. If $\xi_\kappa$ is the numerical excess of observed over expected quantities
falling in the interval $I_\kappa$, the asymptotic expectation of the excess-power product
$\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots$ is given by

(25)   $n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim 0$   if $\alpha + \beta + \gamma + \cdots$ uneven,

       $n^m E_n\{\xi_1^\alpha \xi_2^\beta \xi_3^\gamma \cdots\} \sim \alpha!\, \beta!\, \gamma! \cdots \sum \frac{(\tfrac12 P_{11})^{a_{11}} (\tfrac12 P_{22})^{a_{22}} \cdots P_{12}^{a_{12}} P_{13}^{a_{13}} \cdots}{a_{11}!\, a_{22}! \cdots a_{12}!\, a_{13}! \cdots}$   if $\alpha + \beta + \gamma + \cdots$ even,

the sum to be extended over all sets of non-negative integers $a_{11}, a_{22}, \cdots, a_{12}, \cdots$
that fulfill the conditions

(25')   $a_{11} = \tfrac12(\alpha - a_{12} - a_{13} - \cdots) \ge 0, \quad a_{22} = \tfrac12(\beta - a_{21} - a_{23} - \cdots) \ge 0, \cdots$

The $P_{\iota\kappa}$ as defined in (23) depend on two groups of mean values only, namely on

(25'')   $\bar p_\kappa = \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\kappa}$ and $\overline{p_\iota p_\kappa} = \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu\iota}\, p_{\nu\kappa}$.

Some properties of the matrix $P_{\iota\kappa}$ will be discussed in the next Section.

For practical computation, instead of (25), a recursion formula may be used
which follows immediately from (24). Writing simply $(\alpha, \beta, \gamma, \cdots)$ for the sum
in (24) the formula reads

(26)   $(\alpha, \beta, \gamma, \cdots) = \tfrac12 (\alpha - 2, \beta, \gamma, \cdots) P_{11} + \tfrac12 (\alpha, \beta - 2, \gamma, \cdots) P_{22} + \cdots$
       $\qquad + (\alpha - 1, \beta - 1, \gamma, \cdots) P_{12} + (\alpha, \beta - 1, \gamma - 1, \cdots) P_{23} + \cdots$

If all the original distributions $V_\nu(x)$ are equal, this recursion formula, and from
it (25), can be derived almost immediately from the theorem on the multiplication
of characteristic functions with the addition of chance variables.
Note that the expectation of the product $\xi_\iota \xi_\kappa$ is $P_{\iota\kappa}/n$ for any value of $n$.
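The recursion (26) lends itself to a short program. The following Python sketch is only one possible reading of it: it assumes that each removed "double" $(\iota, \iota)$ contributes a factor $\tfrac12 P_{\iota\iota}$, that each removed couple $(\iota, \kappa)$ with $\iota < \kappa$ contributes $P_{\iota\kappa}$, and that the result is multiplied by $\alpha!\,\beta!\,\gamma!\cdots/m!$ as in (24). The function name and the test matrices are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

def asymptotic_moment(exponents, P):
    """Limit of n^m E{xi_1^a xi_2^b ...}, built from the recursion (26).

    exponents: tuple (a, b, c, ...); P: symmetric matrix as nested lists.
    Returns 0 when the exponent sum is uneven.
    """
    total = sum(exponents)
    if total % 2:
        return Fraction(0)
    m = total // 2

    @lru_cache(maxsize=None)
    def s(exps):
        # s(...) plays the role of the bracket symbol (alpha, beta, ...) in (26)
        if not any(exps):
            return Fraction(1)
        value = Fraction(0)
        for i, a in enumerate(exps):
            if a >= 2:                       # remove a double (i, i): weight P_ii / 2
                rest = exps[:i] + (a - 2,) + exps[i + 1:]
                value += Fraction(P[i][i]) / 2 * s(rest)
            for j in range(i + 1, len(exps)):
                if a >= 1 and exps[j] >= 1:  # remove a couple (i, j), i < j
                    rest = list(exps)
                    rest[i] -= 1
                    rest[j] -= 1
                    value += Fraction(P[i][j]) * s(tuple(rest))
        return value

    scale = Fraction(1)
    for a in exponents:
        scale *= factorial(a)
    return scale / factorial(m) * s(tuple(exponents))

P2 = [[2, -1], [-1, 3]]                                        # hypothetical P for k = 2
P4 = [[2, 1, 0, 1], [1, 3, 1, 0], [0, 1, 2, 1], [1, 0, 1, 4]]  # hypothetical P for k = 4
```

Under these assumptions the fourth-order values agree with the closed form (25): a single variable gives $3P_{11}^2$, two variables give $P_{11}P_{22} + 2P_{12}^2$, and four distinct variables give the three-term sum over pairings.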


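4. Asymptotic expectation and variance of quantics. We first state a
characteristic property of the expression (25) for the expectation of an excess power
product. Let us denote by $C_{\alpha,\beta,\gamma,\cdots}$ the right-hand side of (25) in the case
of even $\alpha + \beta + \gamma + \cdots$. Then, if $C_{\alpha,\beta,\gamma,\cdots}$ is expressed in terms of $P_{\iota\kappa}$ and
each time the subscript 2 is changed into 1, we arrive at the value of $C_{\alpha+\beta,0,\gamma,\cdots}$.
This would not be the case if $C_{\alpha,\beta,\gamma,\cdots}$ were expressed in terms of $p_\kappa$, since e.g.

$C_{2,0} = P_{11} = \bar p_1 - \overline{p_1 p_1}, \qquad C_{1,1} = P_{12} = -\overline{p_1 p_2}$.

In order to prove the statement we observe that the $C_{\alpha,\beta,\gamma,\cdots}$ can be derived
from the coefficients in the development of the $m$th power of a quadric:

(27)   $\Bigl(\tfrac12 \sum_{\iota,\kappa} P_{\iota\kappa}\, t_\iota t_\kappa\Bigr)^m = m! \sum C_{\alpha,\beta,\gamma,\cdots} \frac{t_1^\alpha\, t_2^\beta\, t_3^\gamma \cdots}{\alpha!\, \beta!\, \gamma! \cdots}$.

It follows that

(27')   $C_{\alpha,\beta,\gamma,\cdots} = \frac{1}{m!} \frac{\partial^{2m}}{\partial t_1^\alpha\, \partial t_2^\beta\, \partial t_3^\gamma \cdots} \Bigl(\tfrac12 \sum_{\iota,\kappa} P_{\iota\kappa}\, t_\iota t_\kappa\Bigr)^m$.

If in the subscripts of $P_{\iota\kappa}$ the ones and twos are identified, the quadric becomes
a function of $t_1 + t_2, t_3, t_4, \cdots$ and the derivative with respect to
$\partial t_1^\alpha\, \partial t_2^\beta$ equals the derivative with respect to $\partial t_1^{\alpha+\beta}$. On the other
hand, the latter derivative corresponds to the value of $C_{\alpha+\beta,0,\gamma,\cdots}$ in the form (27').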
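Taking $m = 2$, $\alpha = \beta = \gamma = \delta = 1$, eq. (25) supplies

(28)   $n^2 E_n\{\xi_\iota \xi_\kappa \xi_\lambda \xi_\mu\} \sim P_{\iota\kappa} P_{\lambda\mu} + P_{\iota\lambda} P_{\kappa\mu} + P_{\iota\mu} P_{\kappa\lambda}$.

According to the above statement this is correct whether $\iota, \kappa, \lambda, \mu$ are or are not
different from each other. Thus, if $\psi_{\iota\kappa\lambda\mu}$ is a symmetric set of constants, we
have

$n^2 E_n\{f_4\} \sim 3 \sum_{\iota,\kappa,\lambda,\mu} \psi_{\iota\kappa\lambda\mu}\, P_{\iota\kappa} P_{\lambda\mu}$.

In general, the numerical factor to the right, i.e. the number of sets of couples
drawn from $2m$ figures, is $(2m)!/2^m m! = 1 \cdot 3 \cdots (2m-1)$. Thus we can
state:

Lemma $B_2$. If a quantic $f_{2m}$ is defined according to (8) with symmetric
coefficients, its asymptotic expectation is given by

(29)   $n^m E_n\{f_{2m}\} \sim 1 \cdot 3 \cdot 5 \cdots (2m-1) \sum \psi_{\iota_1 \iota_2 \cdots \iota_{2m}}\, P_{\iota_1\iota_2} P_{\iota_3\iota_4} \cdots P_{\iota_{2m-1}\iota_{2m}}$.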
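The numerical factor $1 \cdot 3 \cdot 5 \cdots (2m-1)$ is the number of ways of splitting $2m$ figures into $m$ unordered couples. A minimal Python sketch, with illustrative function names, enumerates these splittings directly and compares the count with $(2m)!/2^m m!$.

```python
from math import factorial

def pairings(items):
    """Yield every way of splitting an even-sized list into unordered couples."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def odd_product(m):
    """1 * 3 * 5 * ... * (2m - 1)."""
    result = 1
    for odd in range(1, 2 * m, 2):
        result *= odd
    return result

# Number of pairings of 2m figures for m = 1, 2, 3, 4.
pairing_counts = [sum(1 for _ in pairings(list(range(2 * m)))) for m in range(1, 5)]
```

For $m = 2$ the three pairings are exactly the three products on the right-hand side of (28).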

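Before applying this to the continuous case defined in (6), let us consider some
characteristic properties of the matrix $P_{\iota\kappa}$. According to the definition (19)
of $P^{(\nu)}_{\iota\kappa}$ we have

(30)   $\sum_{\iota,\kappa=1}^{k} P^{(\nu)}_{\iota\kappa}\, x_\iota x_\kappa = \sum_{\iota=1}^{k} p_{\nu\iota}\, x_\iota^2 - \Bigl(\sum_{\iota=1}^{k} p_{\nu\iota}\, x_\iota\Bigr)^2$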

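and using (9) one easily derives from Schwarz' inequality

$\sum_{\iota,\kappa=1}^{k} P^{(\nu)}_{\iota\kappa}\, x_\iota x_\kappa \ge p_{\nu,k+1} \sum_{\iota=1}^{k} p_{\nu\iota}\, x_\iota^2 \ge 0$.

Since $P_{\iota\kappa}$ is the arithmetical mean of the $P^{(\nu)}_{\iota\kappa}$ it follows that the matrix
$P_{\iota\kappa}$ is at least semi-definite and is positive definite except when all
$p_{\nu,k+1} = 0$. In the latter case (if e.g. the $k$ intervals cover the whole $x$-axis)
one has

(31)   $\sum_{\iota,\kappa=1}^{k} P_{\iota\kappa} = \frac{1}{n} \sum_{\nu=1}^{n} \Bigl[ \sum_{\iota=1}^{k} p_{\nu\iota} - \Bigl(\sum_{\iota=1}^{k} p_{\nu\iota}\Bigr)^2 \Bigr] = \frac{1}{n} \sum_{\nu=1}^{n} p_{\nu,k+1}(1 - p_{\nu,k+1}) = 0$

which shows that here the reciprocal matrix $P^{-1}_{\iota\kappa}$ does not exist.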

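In the “complete” case, that is, with all $p_{\nu,k+1} = 0$, the elements in each
horizontal or vertical line of the matrix $P_{\iota\kappa}$ have the sum zero. It follows that
the $k$ homogeneous equations $\sum_\kappa P_{\iota\kappa} x_\kappa = 0$ have the solution
$x_1 = x_2 = \cdots = x_k$ and, therefore, that the cofactors of all elements of $P_{\iota\kappa}$
have one and the same value. For each single $\nu$ the determinant of $P^{(\nu)}_{\iota\kappa}$ can be
computed:

(32)   $\bigl| P^{(\nu)}_{\iota\kappa} \bigr| = p_{\nu 1}\, p_{\nu 2} \cdots p_{\nu k}\, p_{\nu,k+1}$.

If this is applied to the principal minors of the same determinant in the case
$p_{\nu,k+1} = 0$, one finds the characteristic equation of the matrix $P^{(\nu)}_{\iota\kappa}$ to be

$\bigl| \delta_{\iota\kappa} - \lambda P^{(\nu)}_{\iota\kappa} \bigr| = -\frac{d}{d\lambda} \prod_{\iota=1}^{k} (1 - \lambda\, p_{\nu\iota}) = 0$.

This shows that $(k-1)$ characteristic roots separate the abscissas $1/p_{\nu 1}$,
$1/p_{\nu 2}, \cdots, 1/p_{\nu k}$ (one root being zero).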
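The determinant of the matrix defined in (19) and its behaviour in the “complete” case can be verified exactly for small $k$. The sketch below assumes the determinant equals the product of the $k$ interval probabilities and the complementary probability; the helper names and probability values are illustrative assumptions.

```python
from fractions import Fraction

def excess_matrix(p):
    """P_ik = delta_ik * p_i - p_i * p_k for one collective, as in (19)."""
    k = len(p)
    return [[(p[i] if i == j else 0) - p[i] * p[j] for j in range(k)]
            for i in range(k)]

def determinant(matrix):
    """Exact determinant by fraction-valued Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)              # a zero column: singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det

p_open = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 4)]   # leaves 1/4 outside
p_full = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]   # covers the whole axis
det_open = determinant(excess_matrix(p_open))
det_full = determinant(excess_matrix(p_full))
```

In the complete case every row sums to zero, so the matrix is singular, in line with the remark that no reciprocal matrix exists there.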

The number $k$ of intervals has nothing to do with the preceding argument
leading to the eqs. (25) to (28). Also, the entire computation can be repeated
in terms of $dT_n(x_1), dT_n(x_2), dT_n(x_3), \cdots$ instead of $\xi_1, \xi_2, \xi_3, \cdots$ if
appropriate differentials are substituted for the $P_{\iota\kappa}$. To find the latter ones we note
that $p_{\nu\iota}$ stands for the increment $dV_\nu(x)$. Thus, using $\delta(x, y)$ in analogy to
$\delta_{\iota\kappa}$ ($= 1$ for $x = y$ and $= 0$ for $x \ne y$) we set

$dU_\nu(x, y) = \delta(x, y)\, dV_\nu(x) - dV_\nu(x)\, dV_\nu(y) = \delta(x, y)\, dV_\nu(x) - dW_\nu(x, y)$

which is equivalent to the definition of a function of 2 variables:

(33)   $U_\nu(x, y) = V_\nu(x) - V_\nu(x) V_\nu(y) = V_\nu(x) - W_\nu(x, y) \quad (x \le y)$,
       $U_\nu(x, y) = V_\nu(y) - V_\nu(x) V_\nu(y) = V_\nu(y) - W_\nu(x, y) \quad (x \ge y)$.

Then $P_{\iota\kappa}$ has to be replaced by

(34)   $dU_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} dU_\nu(x, y) = \delta(x, y)\, d\bar V_n(x) - dW_n(x, y)$.

This $dU_n(x, y)$ is $n$ times the expectation of $dT_n(x)\, dT_n(y)$.
The function

(35)   $U_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} U_\nu(x, y)$

is the difference of two cumulative distribution functions, one corresponding to
a distribution along the straight line $x = y$ with the element $d\bar V_n(x)$ and
another distribution over the whole plane with the element

(35')   $dW_n(x, y) = \frac{1}{n} \sum_{\nu=1}^{n} dV_\nu(x)\, dV_\nu(y)$.

To each one-dimensional distribution $V_\nu(x)$ belongs one “distribution excess”
$U_\nu(x, y)$ as defined in (33). The $P^{(\nu)}_{\iota\kappa}$ are the increments of $U_\nu(x, y)$ within
the product interval $dx\, dy$. It is seen from the preceding argument that the
asymptotic moments of any quantic (6) or (8) depend only on the average $U_n$
of the distribution excesses $U_\nu$.
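The statement that the quantities of (19) are increments of the distribution excess can be illustrated for a single distribution. The Python sketch below assumes the uniform distribution on $[0, 1]$ and writes the two branches of (33) as $V(\min(x, y)) - V(x)V(y)$; the function names and intervals are choices made only for this example.

```python
from fractions import Fraction

def uniform_cdf(x):
    """V(x) for the uniform distribution on [0, 1] (an assumed example)."""
    return min(max(Fraction(x), Fraction(0)), Fraction(1))

def excess(x, y, cdf=uniform_cdf):
    """U(x, y) = V(min(x, y)) - V(x) V(y), both branches of (33) at once."""
    return cdf(min(x, y)) - cdf(x) * cdf(y)

def increment(a1, b1, a2, b2, cdf=uniform_cdf):
    """Increment of U over the rectangle (a1, b1] x (a2, b2]."""
    return (excess(b1, b2, cdf) - excess(a1, b2, cdf)
            - excess(b1, a2, cdf) + excess(a1, a2, cdf))

p1, p2 = Fraction(1, 5), Fraction(3, 10)       # I_1 = (0, 1/5], I_2 = (1/5, 1/2]
off_diagonal = increment(0, p1, p1, p1 + p2)   # increment over I_1 x I_2
diagonal = increment(0, p1, 0, p1)             # increment over I_1 x I_1
```

The off-diagonal increment reproduces $-p_1 p_2$ and the diagonal one $p_1(1 - p_1)$, the two cases of (18).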

If a quantic is defined by (6) and the integrals on both sides exist, the
asymptotic expectation of $f_{2m}$ may be written in formal analogy to (29) as

(36)   $n^m E_n\{f_{2m}\} \sim 1 \cdot 3 \cdot 5 \cdots (2m-1) \iint \cdots \int \psi(x_1, x_2, \cdots, x_{2m})\, dU_n(x_1, x_2)\, dU_n(x_3, x_4) \cdots dU_n(x_{2m-1}, x_{2m})$.

This formula is identical with (29) if $\psi$ has constant values in a finite number
of intervals and vanishes outside these intervals. But it will be seen in the next
section that (36) can be used in more general cases also.

For the sake of practical computation one may develop the right-hand side
of (36) into terms explicitly depending on the given averages $\bar V_n(x)$ and $W_n(x, y)$.
For example, in the case $m = 3$:

(37)   $n^3 E_n\{f_6\} \sim 1 \cdot 3 \cdot 5 \Bigl[ \int\!\cdots\!\int \psi(x_1, x_1, x_2, x_2, x_3, x_3)\, d\bar V_n(x_1)\, d\bar V_n(x_2)\, d\bar V_n(x_3)$
       $\qquad - 3 \int\!\cdots\!\int \psi(x_1, x_1, x_2, x_2, x_3, x_4)\, d\bar V_n(x_1)\, d\bar V_n(x_2)\, dW_n(x_3, x_4)$
       $\qquad + 3 \int\!\cdots\!\int \psi(x_1, x_1, x_2, x_3, x_4, x_5)\, d\bar V_n(x_1)\, dW_n(x_2, x_3)\, dW_n(x_4, x_5)$
       $\qquad - \int\!\cdots\!\int \psi(x_1, x_2, x_3, x_4, x_5, x_6)\, dW_n(x_1, x_2)\, dW_n(x_3, x_4)\, dW_n(x_5, x_6) \Bigr]$.

In the general case, the numerical factors in the $m$-tuple integral are the binomial
coefficients of order $m$.

The higher moments of quantics $f_m$ can be computed in the same way as
$E_n\{f_m\}$ since any power of $f_m$ is a quantic again. The formulas, however,
become more involved since the coefficients of such a power are not immediately given
in a symmetric form. It will suffice to show here how the (second order) variance
of $f_2$ can be found. The second moment is the expectation of

(39)   $f_2^2 = \iiiint \psi(x, y)\, \psi(z, u)\, dT_n(x)\, dT_n(y)\, dT_n(z)\, dT_n(u)$.

Applying here eq. (28) we have

(40)   $n^2 E_n\{f_2^2\} \sim \iiiint \psi(x, y)\, \psi(z, u)\, [\, dU_n(x, y)\, dU_n(z, u) + dU_n(x, z)\, dU_n(y, u) + dU_n(x, u)\, dU_n(y, z)\, ]$.

The first term in the brackets leads to the square of $n E_n\{f_2\}$ while the second
and third terms, due to the symmetry of $\psi(x, y)$, supply two equal integrals.
Thus

(41)   $\mathrm{Var}\{n f_2\} \sim 2 \iiiint \psi(x, y)\, \psi(z, u)\, dU_n(x, z)\, dU_n(y, u)$
       $\qquad = 2 \Bigl[ \iint \psi(x, y)^2\, d\bar V_n(x)\, d\bar V_n(y) - 2 \iiint \psi(x, y)\, \psi(y, z)\, d\bar V_n(y)\, dW_n(x, z) + \iiiint \psi(x, y)\, \psi(z, u)\, dW_n(x, z)\, dW_n(y, u) \Bigr]$.

In the same way moments and variances of any order can be computed for any
quantic $f_m$.


5. Final statement on the limit of expectation of quantics. We shall prove
the following:

Lemma $B_3$. Given a sequence of distributions $V_1(x), V_2(x), V_3(x), \cdots$ and a
quantic of order $2m$

$f_{2m} = \iint \cdots \int \psi(x_1, x_2, \cdots, x_{2m})\, dT_n(x_1)\, dT_n(x_2) \cdots dT_n(x_{2m})$,

assume that there exist a continuous function $\Psi(x)$ and a distribution $V(x)$ such that

(42)   $|\psi(x_1, x_2, \cdots, x_{2m})| \le \Psi(x_1)\, \Psi(x_2) \cdots \Psi(x_{2m})$,
       $dV_\nu(x) \le dV(x)$ for $|x| > X$, $\nu = 1, 2, 3, \cdots$

and that the integrals

(42')   $\int [\Psi(x)]^r\, dV(x), \quad (r = 1, 2, \cdots, 2m)$,

have finite values. Then, for any $\delta > 0$

(43)   $\lim_{n\to\infty} n^{m-\delta} E_n\{f_{2m}\} = 0$.

This lemma, on which the main theorem of Part II is based, will be established
if it is shown that the formula (36) holds true for functions $\psi$ satisfying
the conditions (42).

In the transition from the complete expression (10) for the expectation $E_n$
to the asymptotic value (25) two essential steps were made. First, certain
products of the form (11) have been omitted and, second, certain products
of $P^{(\nu)}_{\iota\kappa}$ as defined in (19) have been arbitrarily added. This was allowed
because each of the products was seen to be smaller than 1 and their number was
of the order $O(n^{m-1})$. If a quantic in integral form (6) is considered which
involves an infinite number of expressions like (10), a sharper estimate is
necessary.

It is easily seen that each integral (11') is a polynomial in $p_{\nu\kappa}$ including the
product $p_{\nu 1}\, p_{\nu 2}\, p_{\nu 3} \cdots$ and another factor which is certainly bounded whatever
the $p_{\nu\kappa}$ are. Thus, if the expectation of $\xi_1 \xi_2 \cdots \xi_{2m}$ is computed, each term of the
form (11') consists of a finite factor and the product $p_{\nu 1}\, p_{\nu 2} \cdots p_{\nu,2m}$. In passing
to the expectation of the quantic, the $p_{\nu\iota}$ have to be replaced by $dV_\nu(x_\iota)$ and
each neglected term in (10) leads to an expression like

(45)   $\iint \cdots \int \psi(x_1, x_2, \cdots, x_{2m})\, dV_{\nu_1}(x_1)\, dV_{\nu_2}(x_2) \cdots dV_{\nu_\kappa}(x_\kappa)$.

According to the assumptions of $B_3$ this integral has a finite value. The number
of neglected terms being of the order $O(n^{m-1})$ the omission of these terms is
justified.

On the other hand, products of $P^{(\nu)}_{\iota\kappa}$ equal, except for the sign, products
of $p_{\nu\iota}\, p_{\nu\kappa}$ as long as $\iota \ne \kappa$ and, except for a finite factor, products of $p_{\nu\iota}$ as often
as $\iota = \kappa$. Again it is seen that the arbitrarily added terms sum up to integrals
of the form (45). This shows that here too, if the conditions of $B_3$ are fulfilled,
the procedure leading to (25) may be applied.

It follows that, under the conditions (42), if the integral (42') has a finite
value, eq. (36) is correct and (43) is an immediate consequence of it. On the
other hand, it is obvious that weaker conditions than those given in $B_3$ would
suffice to establish (43).


6. Theorem on products of n functions. The principal source of all explicit
formulas on asymptotic distributions lies in certain properties of products of a
great number of factors. Laplace devoted a part of his fundamental Treatise
of Probability to these problems, but a complete outline of all results from a
modern point of view is still lacking. In the third part of the present paper, a
rather simple statement on this line will be used which may be formulated here as

Lemma C. Let $F_\nu(z_1, z_2, \cdots, z_k)$, $(\nu = 1, 2, 3, \cdots)$, be a sequence of analytic
functions of $k$ complex variables and $\Phi_n$ the product $F_1 F_2 \cdots F_n$. Suppose that
at the point $z_1 = z_2 = \cdots = z_k = 0$ all $F_\nu$ have the value 1, vanishing first
derivatives, and the second derivatives

(46)   $\frac{\partial^2 F_\nu}{\partial z_\iota\, \partial z_\kappa} = A^{(\nu)}_{\iota\kappa}$.

Then

(47)   $\lim_{n\to\infty} \Bigl[ \log \Phi_n\Bigl(\frac{z_1}{\sqrt n}, \frac{z_2}{\sqrt n}, \cdots, \frac{z_k}{\sqrt n}\Bigr) - \frac{1}{2n} \sum_{\nu=1}^{n} \sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa}\, z_\iota z_\kappa \Bigr] = 0$

uniformly in each bounded region $|z_\iota| \le Z$ in which the absolute values of the third
derivatives of all $F_\nu$ have an upper bound $M$.

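In fact, the Taylor development of $F_\nu$ supplies under the conditions stated:

(48)   $F_\nu(z_1, z_2, \cdots, z_k) = 1 + \tfrac12 \sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa}\, z_\iota z_\kappa + O(Z^3)$

and, therefore,

(48')   $\log F_\nu(z_1, z_2, \cdots, z_k) = \tfrac12 \sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa}\, z_\iota z_\kappa + O(Z^3)$.

If here all $z_\iota$ are replaced by $z_\iota/\sqrt n$ and the equations added for
$\nu = 1, 2, \cdots, n$ we obtain

(49)   $\log \prod_{\nu=1}^{n} F_\nu\Bigl(\frac{z_1}{\sqrt n}, \frac{z_2}{\sqrt n}, \cdots, \frac{z_k}{\sqrt n}\Bigr) = \frac{1}{2n} \sum_{\nu=1}^{n} \sum_{\iota,\kappa} A^{(\nu)}_{\iota\kappa}\, z_\iota z_\kappa + n\, O\Bigl(\frac{Z^3}{n^{3/2}}\Bigr)$

and this shows that the brackets on the left-hand side of (47) are $O(Z^3/\sqrt n)$.
It is obvious that (47) would still hold if the condition concerning the third
derivatives is replaced by a somewhat weaker one.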
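A one-variable numerical experiment illustrates the statement of Lemma C. The sketch below assumes factors of the form $\cos(a_\nu z)$, which have the value 1, a vanishing first derivative and the second derivative $-a_\nu^2$ at $z = 0$; this family and the bounded sequence $a_\nu$ are chosen only for illustration and are not taken from the paper.

```python
import math

def log_product(scales, z):
    """log of prod_v cos(a_v * z / sqrt(n)) with n = len(scales)."""
    n = len(scales)
    return sum(math.log(math.cos(a * z / math.sqrt(n))) for a in scales)

def quadratic_limit(scales, z):
    """(1 / 2n) * sum_v A_v * z^2 with A_v = -a_v^2."""
    n = len(scales)
    return sum(-a * a for a in scales) * z * z / (2 * n)

def gap(n, z=1.0):
    """Absolute difference between the two sides for a bounded sequence a_v."""
    scales = [1.0 + 0.5 * math.sin(v) for v in range(1, n + 1)]
    return abs(log_product(scales, z) - quadratic_limit(scales, z))

gaps = {n: gap(n) for n in (10, 100, 10000)}
```

For this particular family the third derivatives vanish at the origin, so the gap shrinks like $1/n$, which is faster than the general bound but consistent with it.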





PART II. DIFFERENTIABLE STATISTICAL FUNCTIONS 

1. Definitions. We consider a one-dimensional cumulative distribution function
$V(x)$ as a point in the $V$-space. If two points $V_1(x)$ and $V_2(x)$ are given
the functions

(1)   $V_1(x) + t[V_2(x) - V_1(x)], \qquad 0 \le t \le 1$

represent the straight segment between $V_1(x)$ and $V_2(x)$. A subset of the $V$-space
that includes all segments determined by its elements is called a convex domain.

Now, assume that a sequence of collectives with the distributions $V_1(x)$,
$V_2(x)$, $V_3(x), \cdots$ be given. We shall consider functions $f\{V(x)\}$ defined in a
convex domain that includes particularly: (1) all average distributions $\bar V_n(x)$

(2)   $\bar V_n(x) = \frac{1}{n} \sum_{\nu=1}^{n} V_\nu(x)$

at least from a certain $n$ on; (2) all repartitions $S_n(x)$ that can occur, i.e. the
repartitions of $n$ quantities that belong to the label sets of the given collectives
(e.g. positive $x$, etc.). If $V^0(x)$ and $V(x)$ are any two points of the domain, the
quantity

(3)   $F(t) = f\{V^0(x) + t[V(x) - V^0(x)]\}, \qquad 0 \le t \le 1$

is a function of the real variable $t$. It will be supposed to admit derivatives
with respect to $t$ up to the order $r + 1$.

Following Volterra [9, 10] we define (in a slightly modified way) the derivative
$f'$ of a statistical function $f$ in analogy to the set of partial derivatives of a
function of several variables. If $V(x)$ would stand for a set of distinct variables
$V_1, V_2, V_3, \cdots$ and $V^0(x)$ for their initial values $V^0_1, V^0_2, V^0_3, \cdots$ one would
have

$\frac{d}{dt} f\{V^0 + t[V - V^0]\}_{t=0} = \sum_\nu \frac{\partial f}{\partial V_\nu} (V_\nu - V^0_\nu)$

where $\partial f/\partial V_\nu$ is the partial derivative of $f$ with respect to $V_\nu$ taken at the point
$V_\nu = V^0_\nu$. Thus we write

(4)   $\frac{d}{dt} f\{V^0(x) + t[V(x) - V^0(x)]\}_{t=0} = \int f'\{V^0(x), y\}\, d(V - V^0)(y)$

and call $f'$, which depends on $V^0(x)$ and on a scalar variable $y$, but not on $V(x)$,
the (first) derivative of $f\{V(x)\}$ at the point $V^0(x)$. Only if a relation (4) is
fulfilled for any two points of the convex domain, $f$ is called a (one time)
differentiable function.

The derivative of a linear function

(5)   $A = \int \alpha(x)\, dV(x), \quad B = \int \beta(x)\, dV(x), \cdots$

is simply the factor $\alpha(y), \beta(y), \cdots$ respectively, independent of the point at
which the derivative is taken. If $f$ is given as a function of $A, B, \cdots$ one has

(6)   $f' = \frac{\partial f}{\partial A}\, \alpha(y) + \frac{\partial f}{\partial B}\, \beta(y) + \cdots$.

The derivative of the non-linear function

(7)   $f = \iint \psi(x, y)\, dV(x)\, dV(y)$

is

(8)   $f'\{V^0(x), y\} = \int [\psi(x, y) + \psi(y, x)]\, dV^0(x)$.

Note that an additive constant in $f'$ (i.e. a quantity independent of $y$) has no
significance since the integral of $d(V - V^0)$ vanishes. It follows from (6)
that the first derivative of the $m$th order variance as defined in (2) of the
Introduction, at the point $V^0(x)$ is

(9)   $(y - a_0)^m - m\, y \int (x - a_0)^{m-1}\, dV^0(x)$

where $a_0$ is the mean value of $V^0(x)$.
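The derivative (9) can be compared with a finite-difference quotient of the definition (4). The following Python sketch assumes discrete distributions on a common set of points and takes the $m$th order variance to be the $m$th moment about the mean; the point set, the weights and the function names are illustrative assumptions.

```python
def central_moment(points, weights, m):
    """f{V} = integral of (x - a)^m dV(x) for a discrete distribution."""
    mean = sum(w * x for x, w in zip(points, weights))
    return sum(w * (x - mean) ** m for x, w in zip(points, weights))

def first_derivative_kernel(points, weights0, m, y):
    """f'{V0, y} as in (9): (y - a0)^m - m*y*integral (x - a0)^(m-1) dV0(x)."""
    a0 = sum(w * x for x, w in zip(points, weights0))
    lower = sum(w * (x - a0) ** (m - 1) for x, w in zip(points, weights0))
    return (y - a0) ** m - m * y * lower

def directional_derivative(points, w0, w1, m, h=1e-6):
    """Central difference of t -> f{V0 + t (V - V0)} at t = 0."""
    def f(t):
        mixed = [a + t * (b - a) for a, b in zip(w0, w1)]
        return central_moment(points, mixed, m)
    return (f(h) - f(-h)) / (2 * h)

points = [0.0, 1.0, 2.5, 4.0]
w0 = [0.1, 0.4, 0.3, 0.2]          # weights of V0 (hypothetical)
w1 = [0.25, 0.25, 0.25, 0.25]      # weights of V (hypothetical)
m = 3
lhs = directional_derivative(points, w0, w1, m)
rhs = sum(first_derivative_kernel(points, w0, m, y) * (b - a)
          for y, a, b in zip(points, w0, w1))
```

Here `rhs` is the integral of the kernel against $d(V - V^0)$, which for discrete distributions is a finite sum over the weight differences.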

In the same way derivatives of higher order can be introduced. The second
derivative of $f\{V(x)\}$ is a function of $V^0(x)$, i.e. of the point at which the
derivative is taken, and of two scalar variables $y, z$ which correspond to the two
subscripts in the case of a function of distinct variables. The definition of
$f''\{V(x), y, z\}$ is given in the equation

(10)   $\frac{d^2}{dt^2} f\{V^0(x) + t[V(x) - V^0(x)]\}_{t=0} = \iint f''\{V^0(x), y, z\}\, d(V - V^0)(y)\, d(V - V^0)(z)$.

The second derivative of a linear function is zero. The function (7) has the
second derivative $\psi(z, y) + \psi(y, z)$ independently of $V^0(x)$. The $m$th order
variance gives, twice differentiated

(11)   $-2m\, z\, (y - a_0)^{m-1} + m(m-1)\, y z \int (x - a_0)^{m-2}\, dV^0(x)$.

The variables $y$ and $z$ in $f''$ or in any additive term of $f''$ may be interchanged
and a term depending on one of them may be added or omitted. Thus, $f''$
can always be written as a symmetric function of $y, z$ without linear terms.
Accordingly, the second derivative of (7) is also $2\psi(y, z)$.





The derivative of $r$th order of $f$ at the point $V^0(x)$ will be defined by the
equation

(12)   $\frac{d^r}{dt^r} f\{V^0(x) + t[V(x) - V^0(x)]\}_{t=0} = \iint \cdots \int f^{(r)}\{V^0(x), y_1, y_2, \cdots, y_r\}\, d(V - V^0)(y_1) \cdots d(V - V^0)(y_r)$.

Here, for given $V^0(x)$, $f^{(r)}$ may be supposed to be a symmetric function of the $r$
variables $y_1, y_2, \cdots, y_r$. The $r$th derivative of the $m$th order variance is

(13)   $\frac{(-1)^r\, m!}{(m-r+1)!}\, y_1 y_2 \cdots y_r \Bigl[ (m-r+1) \int (x - a_0)^{m-r}\, dV^0(x) - \sum_{\kappa=1}^{r} \frac{(y_\kappa - a_0)^{m-r+1}}{y_\kappa} \Bigr]$.

In the case $r = m$ the expression becomes independent of $V^0(x)$, viz.

(13')   $(-1)^m\, m!\, y_1 y_2 \cdots y_m\, (1 - m)$

where terms depending on less than $r$ of the variables $y_1, y_2, \cdots, y_r$ have been
omitted.

If the definitions (4), (10), (12) are compared one can see that $f''\{V, y, z\}$
is the first derivative of $f'\{V, y\}$ etc. For proofs see [9] and [10].


2. Taylor development. The function $F(t)$ defined in (3) admits the development

(14)   $F(1) - F(0) = F'(0) + \tfrac12 F''(0) + \cdots + \frac{1}{r!} F^{(r)}(0) + \frac{1}{(r+1)!} F^{(r+1)}(\vartheta)$

where $\vartheta$ is some quantity between zero and one. According to (3) the left-hand
side equals the difference $f\{V(x)\} - f\{V^0(x)\}$. The expressions $F'(0), F''(0), \cdots,
F^{(r)}(0)$ are the derivatives as defined in eqs. (4), (10), (12). In the last term
to the right, one has to introduce the distribution

(15)   $V'(x) = V^0(x) + \vartheta\, [V(x) - V^0(x)]$

and then to take the $(r+1)$st derivative of $f$ at the point $V'(x)$.
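For the quadratic function (7) the development (14) terminates after the second term, so it can be checked exactly. The Python sketch below assumes discrete distributions, uses the first derivative (8) and the second derivative $\psi(z, y) + \psi(y, z)$ stated above, and works in exact fractions; the kernel, points and weights are hypothetical.

```python
from fractions import Fraction

def psi(x, y):
    """A deliberately non-symmetric kernel (assumed for illustration)."""
    return x * x * y + 3 * x * y

def quadratic_functional(points, weights):
    """f{V} = double integral of psi(x, y) dV(x) dV(y) for a discrete V."""
    return sum(psi(x, y) * wx * wy
               for x, wx in zip(points, weights)
               for y, wy in zip(points, weights))

def taylor_terms(points, w0, w1):
    """Return F'(0) and F''(0) along the segment from V0 to V."""
    delta = [b - a for a, b in zip(w0, w1)]
    first = sum((psi(x, y) + psi(y, x)) * wx * dy      # kernel (8) against d(V - V0)
                for x, wx in zip(points, w0)
                for y, dy in zip(points, delta))
    second = sum((psi(z, y) + psi(y, z)) * dy * dz     # second derivative of (7)
                 for y, dy in zip(points, delta)
                 for z, dz in zip(points, delta))
    return first, second

points = [Fraction(0), Fraction(1), Fraction(3)]
w0 = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
w1 = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
first, second = taylor_terms(points, w0, w1)
difference = quadratic_functional(points, w1) - quadratic_functional(points, w0)
```

Because the function is of second order in $V$, no remainder term is needed in this case.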

For a given $V^0(x)$ each one of the terms on the right-hand side of (14) is a
function of $V(x)$. Except for the last one (in which $\vartheta$ depends in a certain way
on $V(x)$) they are quantics with respect to $V(x) - V^0(x)$, of the same kind as
those considered in Part I. (There we had $S_n$ instead of $V$ and $\bar V_n$ instead
of $V^0$).

The $r$th term of (14) can be written as

(16)   $F_r = \frac{1}{r!} \iint \cdots \int \psi_r(x_1, x_2, \cdots, x_r)\, d(V - V^0)(x_1) \cdots d(V - V^0)(x_r)$

where according to (12)

(16')   $\psi_r(x_1, x_2, \cdots, x_r) = f^{(r)}\{V^0(x), x_1, x_2, \cdots, x_r\}$.

To find the characteristic properties of $F_r$ we compute its derivatives at a point
$V_1(x)$. To do this we must replace in (16) the $V(x)$ by

$V_1(x) + t[V(x) - V_1(x)]$,

then differentiate the product

(17)   $\prod_{\kappa=1}^{r} d[(V_1 - V^0)(x_\kappa) + t(V - V_1)(x_\kappa)]$

with respect to $t$, and finally set $t = 0$. The derivative consists of $r$ terms
the first of which will be

$d(V - V_1)(x_1) \prod_{\kappa=2}^{r} d(V_1 - V^0)(x_\kappa)$.

Due to the fact that $\psi_r$ may be supposed as a symmetric function, all $r$ terms
supply the same integral. Thus the derivative of $F_r$ with respect to $t$ at the point
$t = 0$ can be written as

$\frac{1}{(r-1)!} \iint \cdots \int \psi_r(x_1, x_2, \cdots, x_r)\, d(V - V_1)(x_1) \prod_{\kappa=2}^{r} d(V_1 - V^0)(x_\kappa)$.

Comparing this with the formula (4) which defines the first derivative of a
statistical function and writing $y$ instead of $x_1$ and $V(x)$ instead of $V_1(x)$, we find

(18)   $F_r'\{V(x), y\} = \frac{1}{(r-1)!} \iint \cdots \int \psi_r(y, x_2, \cdots, x_r)\, d(V - V^0)(x_2) \cdots d(V - V^0)(x_r)$.

This is the first derivative of $F_r\{V(x)\}$ at the point $V(x)$. It vanishes at the
point $V(x) = V^0(x)$.

The integral in (18) has the same form as that in (16) except that its
multiplicity is $(r-1)$ rather than $r$. Thus it is immediately seen how the higher
derivatives of $F_r$ can be found. For the second derivative $F_r''\{V(x), y, z\}$
we have simply to replace $(r-1)!$ in (18) by $(r-2)!$, then $x_2$ by $z$ and finally
to omit in the product the differential $d(V - V^0)(x_2)$. This procedure can be
continued up to the derivative of order $(r-1)$. The $r$th derivative, finally,
will be

(19)   $F_r^{(r)}\{V(x), y_1, y_2, \cdots, y_r\} = \psi_r(y_1, y_2, \cdots, y_r)$

independent of $V(x)$ and, according to (16'), equal to the $r$th derivative of
$f\{V(x)\}$ at the point $V^0(x)$. It is also seen that all integrals of the form (16)
or (18) vanish if $V(x)$ equals $V^0(x)$. The results can be summarized as follows:
The $s$th term, $(s = 1, 2, \cdots, r)$, of the development (14) is a function of $V(x)$
for which all derivatives at the point $V^0(x)$ except that of order $s$ vanish while
this one equals the $s$th derivative of the original function $f\{V(x)\}$ at $V^0(x)$.
The complete analogy of (14) with the Taylor development of a function of
distinct variables is thus evident.

If we assume that $f\{V(x)\}$ is a function whose first $(r-1)$ derivatives vanish
at the point $V^0(x)$, eq. (14) takes the form

(20)   $f\{V(x)\} - f\{V^0(x)\} = \frac{1}{r!} \iint \cdots \int f^{(r)}\{V^0(x), y_1, y_2, \cdots, y_r\}\, d(V - V^0)(y_1) \cdots d(V - V^0)(y_r)$
       $\qquad + \frac{1}{(r+1)!} \iint \cdots \int f^{(r+1)}\{V'(x), y_1, y_2, \cdots, y_{r+1}\}\, d(V - V^0)(y_1) \cdots d(V - V^0)(y_{r+1})$.

By applying to this formula the lemmas A and B of Part I, we shall arrive at
the general theorem on asymptotic distributions that is the principal goal of
this paper.


3. General theorem. The main result to be derived in the general theory of
asymptotic distributions is that the so-called normal distribution represents
the first element in an infinite sequence which includes the asymptotic
distributions of all differentiable statistical functions, except certain irregular
cases. The Gauss distribution covers in fact only those functions whose Taylor
development starts with the first (linear) term, in particular the linear statistical
functions themselves. If the first $(r-1)$ terms in the development vanish,
the asymptotic distribution of type $r$ becomes valid.

Theorem I: Let $V_1(x), V_2(x), V_3(x), \cdots$ be an infinite sequence of distributions,
and $f\{V(x)\}$ a statistical function with derivatives up to order $(r+1)$. Denote by
$S_n(x)$ the repartition of the $n$ label values in the collective with the distribution
element $dV_1(x_1)\, dV_2(x_2) \cdots dV_n(x_n)$ and by $\bar V_n(x)$ the arithmetical mean of
$V_1(x), V_2(x), \cdots, V_n(x)$. If for large $n$ the first $(r-1)$ derivatives of $f\{V(x)\}$ at
the point $\bar V_n(x)$ vanish and the $r$th derivative equals $\psi_n(y_1, y_2, \cdots, y_r)$, then the
distribution of

(21)   $A_n = n^{r/2}\, [f\{S_n(x)\} - f\{\bar V_n(x)\}]$

is asymptotically equal to the distribution of the $r$th order quantic

(22)   $B_n = \frac{n^{r/2}}{r!} \iint \cdots \int \psi_n(x_1, x_2, \cdots, x_r)\, d(S_n - \bar V_n)(x_1)\, d(S_n - \bar V_n)(x_2) \cdots d(S_n - \bar V_n)(x_r)$

under the following conditions:

a) The distribution of (22) has a uniformly bounded derivative for all $n$;

b) Within a convex domain in the $V$-space that includes all $\bar V_n(x)$ from a certain
$n$ on, and all $S_n(x)$ that can occur, the $(r+1)$st derivative of $f\{V(x)\}$ is smaller
in absolute value than a product $\Psi(y_1)\, \Psi(y_2) \cdots \Psi(y_{r+1})$ whereby the
integrals $\int [\Psi(x)]^k\, dV_\nu(x)$ for $k = 1, 2, \cdots, 2(r+1)$ have a finite upper
bound for $\nu = 1, 2, 3, \cdots$.

In order to prove this we introduce in eq. (20) $S_n(x)$ for $V(x)$ and $\bar V_n(x)$ for
$V^0(x)$, and multiply both sides by $n^{r/2}$. Using the notations (21) and (22) and
writing $T_n$ for $(S_n - \bar V_n)$, the equation reads

(23)   $A_n - B_n = \frac{n^{r/2}}{(r+1)!} \iint \cdots \int f^{(r+1)}\{V'(x), y_1, y_2, \cdots, y_{r+1}\}\, dT_n(y_1) \cdots dT_n(y_{r+1})$.

According to Lemma A the theorem will be verified if we can show that the
expectation of the absolute value of the right-hand expression in (23) tends to
zero.

According to the Schwarz inequality one has, for any real $C$:

(24)   $E_n\{|C|\} \le \sqrt{E_n\{C^2\}}$.

For fixed values of $V_\nu$ and $S_n$ the integral on the right-hand side of (23) is a
quantic of order $(r+1)$ with the coefficients $\psi_{r+1}(y_1, y_2, \cdots, y_{r+1})$. The
square of this integral is a quantic of order $2(r+1)$ whose coefficients are a finite
number (depending only on $r$) of terms each of which is a product of two
$\psi_{r+1}$-values implying $2(r+1)$ variables $y_1, y_2, \cdots, y_{2(r+1)}$. The absolute value of
these coefficients is, therefore, according to the condition b) smaller than a
finite factor times the product $\Psi(y_1)\, \Psi(y_2) \cdots \Psi(y_{2(r+1)})$ and thus fulfills the
condition of lemma $B_3$. If the right-hand side of (23) is identified with $C$, the
expectation of $C^2$ is, except for a finite factor, the product of $n^r$ times the expectation
of the above-mentioned quantic of order $2(r+1)$. It then follows from lemma
$B_3$ that the limit of $E_n\{C^2\}$ is zero and from (24):

$\lim_{n\to\infty} E_n\{|C|\} = \lim_{n\to\infty} E_n\{|A_n - B_n|\} = 0$.

This accomplishes the proof of Theorem I.

If we apply here what was shown in Part I about the asymptotic distribution
of a quantic, we can also state the following.

Theorem II: Under the conditions of Theorem I, the asymptotic distribution of a
differentiable statistical function $f\{S_n(x)\}$ is essentially determined by

a) the average distribution $\bar V_n(x)$;

b) the first non-vanishing derivative of $f\{V(x)\}$ at the point $\bar V_n(x)$;

c) the average distribution excess

(25)    $U_n(x, y) = \bar V_n(x) - \dfrac{1}{n} \sum_{\nu=1}^{n} V_\nu(x) V_\nu(y), \quad x \le y$,
        $U_n(x, y) = \bar V_n(y) - \dfrac{1}{n} \sum_{\nu=1}^{n} V_\nu(x) V_\nu(y), \quad x \ge y$.

By “essentially determined” is meant determined except for an additional
function whose moments of any order are zero. The statement then follows
from Theorem I in connection with the fact that the asymptotic moments of
quantics have been computed in Part I from the values of $U_n(x, y)$.

That functions with all moments vanishing exist has been known for a long
time. A simple example given by Shohat and Tamarkin [6] is the following.
Let $k$ be a positive constant smaller than $1/2$ and $u = x^k$, $h = \tan k\pi$. Then,
the density (positive or negative)

(26)    $f(x) = e^{-u} \sin(hu) = \mathrm{Im}\; e^{-u(1 - ih)}$

fulfills the condition. In fact, the $n$th moment of (26) is the (vanishing) imaginary
part of the integral

(27)    $\int_0^\infty x^n e^{-u(1-ih)}\, dx = \dfrac{1}{k} \int_0^\infty u^{(n+1)/k - 1} e^{-u(1-ih)}\, du = \dfrac{(-1)^{n+1}}{k}\, \Gamma\!\left(\dfrac{n+1}{k}\right) (\cos k\pi)^{(n+1)/k}$.

Since $f(x)$ takes negative values of the amount $e^{-u}$ it can be superimposed to a
given distribution density only in cases where the original density remains
greater than some multiple of $e^{-u} = \exp(-x^k)$. It can be shown that the moment
problem is determinate (i.e. the distribution determined by the moments in a
unique way) if the density vanishes at infinity at a sufficiently strong degree.
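The vanishing of the moments of (26) can be illustrated numerically. The following is only a sketch under stated assumptions: it evaluates the Gamma-integral form of (27), $\frac{1}{k}\Gamma\!\left(\frac{n+1}{k}\right)(1 - ih)^{-(n+1)/k}$, with Python's principal-branch complex power, and the function names are hypothetical helpers, not part of the original text.

```python
import math

def moment_integral(n: int, k: float) -> complex:
    """Integral of x**n * exp(-(1 - i*h) * x**k) over x >= 0, with h = tan(k*pi).

    The substitution u = x**k turns it into a Gamma integral,
    (1/k) * Gamma((n+1)/k) * (1 - i*h) ** (-(n+1)/k), valid for 0 < k < 1/2.
    """
    h = math.tan(k * math.pi)
    a = (n + 1) / k
    return math.gamma(a) / k * (1 - 1j * h) ** (-a)

def relative_imaginary_part(n: int, k: float) -> float:
    """|Im| / |value|; the n-th moment of (26) is the imaginary part."""
    value = moment_integral(n, k)
    return abs(value.imag) / abs(value)

if __name__ == "__main__":
    for k in (0.25, 0.3, 0.4):
        for n in range(6):
            print(k, n, relative_imaginary_part(n, k))
```

Up to rounding, the imaginary part is zero for every $n$, which is the statement that all moments of (26) vanish.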

From the standpoint of statistical theory two distributions with the same
moments throughout may be considered as equivalent. This justifies the terminology
used in Theorem II. On the other hand, Theorem I is independent of
this restriction: The asymptotic distribution of the statistical function $f\{S_n(x)\}$
is under the given conditions identical with that of the corresponding quantic
of $m$th order. A detailed discussion of the case $m = 2$ will be given in Part III.
Here follow some illustrations for the general case.

4. Illustrations. The existence of asymptotic distributions of higher types
can be exemplified in a comparatively simple way if we start from any known
asymptotic distribution of a statistical function.

Let us assume that $g\{V(x)\}$ is a function fulfilling the condition

(28)    $g\{\bar V_n(x)\} = 0$

for all $n$, and that the asymptotic c.d.f. for $g\{S_n(x)\}$ is known. There will be
some positive integer $r$ such that

(29)    $\mathrm{Prob}\,[n^{r/2}\, g\{S_n(x)\} \le z] \sim \Phi_n(z)$.

If, for instance, $g$ is a linear statistical function $r$ will be 1 and, under well-known
conditions, $\Phi_n(z)$ a normal (Gaussian) c.d.f. with finite variance depending
on $n$.

Now, let $f$ be an ordinary function of $g$ and thus another statistical function
which may be denoted by $f\{V(x)\}$. According to the rules of differentiation
we have

(30)    $f'\{V(x), y\} = \dfrac{df}{dg}\, g'\{V(x), y\}$

and analogous relations can be derived for the derivatives of higher order. In
particular, the following statement, valid in ordinary differential calculus, holds
true: If $g\{V(x)\}$ has derivatives of every order and if the first $s$ derivatives of $f$
with respect to $g$ vanish at some point $g = g\{V_1(x)\}$ then also the first $s$ derivatives
of $f$ with respect to $V(x)$ will be zero at $V(x) = V_1(x)$. In this way we can
devise statistical functions, with vanishing derivatives, for which the asymptotic
distribution is known.

For the sake of simplicity we may assume that (29) holds with $r = 1$ and
that $f(g)$ is a monotonic increasing function, given in the form

(31)    $f(g) = g^s\,[1 + \alpha(g)]$

with $s$ a positive integer, and the inverse function

(31')   $g(f) = f^{1/s}\,[1 + \beta(f)]$

where $\beta(f)$ goes to zero with $f \to 0$. Then, from (29):

(32)    $\mathrm{Prob}\,[n^{s/2} f\{S_n(x)\} \le z] \sim \Phi_n(z')$

if $z$ and $z'$ are connected by

        $z' = z^{1/s}\,[1 + \beta(n^{-s/2} z)]$.

It follows that

        $z' - z^{1/s} \to 0$

and if $\Phi_n(z)$ is supposed to be continuous, (32) becomes

(33)    $\mathrm{Prob}\,[n^{s/2} f\{S_n(x)\} \le z] \sim \Phi_n(z^{1/s})$.

This is a distribution of type $s$.

Take as an example for $g$ the arithmetical mean

(34)    $g\{S_n(x)\} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} - a_n$

where $x_1, x_2, \ldots, x_n$ are the observed values and $a_n$ is the arithmetical mean
of the mean values of $V_\nu(x)$. Then, under certain restrictions for the $V_\nu(x)$,
there exists a bounded sequence $h_n$ so that

        $\mathrm{Prob}\,[h_n \sqrt{n}\, g \le z] \sim \Phi_n(z) = \dfrac{1}{\sqrt{\pi}} \int_{-\infty}^{z} e^{-u^2}\, du$.

Now if we choose

        $f = 6(g - \sin g) = g^3 - \dfrac{g^5}{20} + \cdots$

the asymptotic distribution of $f$ will be given by

        $\mathrm{Prob}\,[h_n^3\, n\sqrt{n}\, f \le z] \sim \Phi_n(z^{1/3}) = \dfrac{1}{\sqrt{\pi}} \int_{-\infty}^{z^{1/3}} e^{-u^2}\, du$

with the probability density

        $\dfrac{1}{3\sqrt{\pi}}\, z^{-2/3}\, e^{-z^{2/3}}$.

Similar examples can be drawn from the asymptotic distribution of $\chi^2$ if one
asks for the distribution of appropriate functions of $\chi^2$, etc.
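The third-type example can be checked numerically. The sketch below is illustrative only: it assumes the normalization $\Phi(z) = \pi^{-1/2}\int_{-\infty}^{z} e^{-u^2}du = \tfrac12(1 + \mathrm{erf}\,z)$, compares a finite-difference derivative of $\Phi(z^{1/3})$ with the stated density, and checks that $6(g - \sin g)$ behaves like $g^3$ for small $g$. All function names are hypothetical.

```python
import math

def phi(z: float) -> float:
    """C.d.f. (1/sqrt(pi)) * integral of exp(-u^2) from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z))

def real_cube_root(z: float) -> float:
    """Real cube root, defined for negative arguments as well."""
    return math.copysign(abs(z) ** (1.0 / 3.0), z)

def cdf_type3(z: float) -> float:
    """Limiting c.d.f. Phi(z**(1/3)) of the normalized f = 6(g - sin g)."""
    return phi(real_cube_root(z))

def density_type3(z: float) -> float:
    """Density z**(-2/3) * exp(-z**(2/3)) / (3*sqrt(pi)), written for z != 0."""
    w = abs(z) ** (2.0 / 3.0)
    return math.exp(-w) / (3.0 * math.sqrt(math.pi) * w)

def numerical_derivative(func, z: float, step: float = 1e-6) -> float:
    """Central difference approximation of func'(z)."""
    return (func(z + step) - func(z - step)) / (2.0 * step)

if __name__ == "__main__":
    for z in (-2.0, 0.7, 5.0):
        print(z, numerical_derivative(cdf_type3, z), density_type3(z))
    g = 0.01
    print(6.0 * (g - math.sin(g)) / g ** 3)  # close to 1 for small g
```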

PART III. SECOND-TYPE ASYMPTOTIC DISTRIBUTION 

1. Statement of the problem. We now propose to study the asymptotic
distribution of a quantic of second order as defined in eq. (6) of Part I. It
has been shown in Part II that this covers the case of any statistical function
of which the first but not the second derivative at the critical point vanishes.

Independently of what was said before, the problem can be stated in the following
way. Given a function $\psi(x, y)$ and a sequence of cumulative distribution
functions $V_1(x), V_2(x), V_3(x), \ldots$. Let $\bar V_n(x)$ be the arithmetical mean of
$V_1(x), V_2(x), \ldots, V_n(x)$ and $S_n(x)$ the repartition of a sample $z_1, z_2, \ldots, z_n$
drawn from the collective with the distribution element $dV_1(z_1)\, dV_2(z_2) \cdots dV_n(z_n)$,
that is: $nS_n(x)$ is the number of those of the observed values
$z_1, z_2, \ldots, z_n$ that are smaller than or equal to $x$. Then the quantity

(1)     $f = \int\!\!\int \psi(x, y)\, dT_n(x)\, dT_n(y)$, where $T_n(x) = S_n(x) - \bar V_n(x)$,

is determined by the observations $z_1, z_2, \ldots, z_n$. We ask for the distribution
of $f$ at large values of $n$.

Without loss of generality, the function $\psi(x, y)$ can be supposed to be symmetrical.
If, in particular, $\psi(x, y) = \psi(x)\psi(y)$, the quantity $f$ becomes the
square of

(2)     $\int \psi(x)\, dT_n(x) = \dfrac{1}{n} \sum_{\nu=1}^{n} \psi(z_\nu) - \int \psi(x)\, d\bar V_n(x)$

and its asymptotic distribution can be computed in the manner shown in the last
section of Part I. Another example would be

(3)     $\psi(x, y) = g(x) \quad (x \le y)$,
        $\psi(x, y) = g(y) \quad (x \ge y)$.

In this case, integration by parts shows that

(4)     $f = \int g'(x)\, T_n^2(x)\, dx$

where $g'$ is the derivative of $g$. This is the statistical function that takes the
place of $\chi^2$ in continuous problems. See Introduction eq. (3).

Note that the “excess” $T_n(x)$ vanishes at $x = \pm\infty$ and that for sufficiently
large $x$ the increment $dT_n(x)$ equals $-d\bar V_n(x)$. Thus, conditions for the existence
of the integrals in (1), (2), (4), etc. can be expressed in terms of the given
functions $\psi(x, y)$ and $V_\nu(x)$.

We shall first study the special case that implies so-called discontinuous chance
variables. In our terminology it is the function $\psi(x, y)$ that has to be specified.
Let $I_1, I_2, \ldots, I_k$ be $k$ mutually exclusive one-dimensional intervals (or groups
of intervals) and $I_{k+1}$ their complement. Assume that $\psi(x, y)$ has a constant
value $\psi_{\iota\kappa}$ when $x$ falls in $I_\iota$ and $y$ falls in $I_\kappa$ $(\iota, \kappa = 1, 2, \ldots, k+1)$. The increments
of $S_n(x)$, $\bar V_n(x)$, $T_n(x)$ in the interval $I_\iota$ will be called $p'_\iota$, $\bar p_\iota$, $\xi_\iota$ respectively.

Clearly, $np'_\iota$ is the number of observed values falling in $I_\iota$, $n\bar p_\iota$ is the expected
number of such values, and $n(p'_\iota - \bar p_\iota) = n\xi_\iota$ the excess of observed over expected
numbers. Note that the given distributions $V_\nu(x)$ determine increments $p_{\nu\iota}$
in the interval $I_\iota$ and that

(5)     $\bar p_\kappa = \dfrac{1}{n}\,(p_{1\kappa} + p_{2\kappa} + \cdots + p_{n\kappa})$.

Since the sum of all $\xi_\iota$ must be zero we can replace $\xi_{k+1}$ by

(6)     $\xi_{k+1} = -\xi_1 - \xi_2 - \cdots - \xi_k$.

Thus, the integral (1) can now be written as a sum of terms

(7)     $f\{S_n(x)\} = \sum_{\iota,\kappa} \psi_{\iota\kappa}\, \xi_\iota \xi_\kappa$

like that introduced in the second eq. (8) of Part I.

Our next task will be to find the asymptotic distribution of (7) which depends
on the matrix $\psi_{\iota\kappa}$ $(\iota, \kappa = 1, 2, \ldots, k)$, and on the succession of probability
values $p_{\nu\kappa}$ $(\nu = 1, 2, 3, \ldots;\ \kappa = 1, 2, \ldots, k)$. The matrix $\psi_{\iota\kappa}$ in $k$ variables
will be supposed to be symmetrical.


2. Characteristic function. We define our chance variable as

(8)     $x = \dfrac{n}{2}\, f = \dfrac{n}{2} \sum_{\iota,\kappa} \psi_{\iota\kappa}\, \xi_\iota \xi_\kappa$.

All summations, here and in what follows, are to be extended from 1 to $k$ if
not otherwise indicated. If $P_n(x)$ is the c.d.f. of $\tfrac{n}{2} f$, that is

(9)     $\mathrm{Prob}\,\{\tfrac{n}{2} f \le x\} = P_n(x)$

the characteristic function (c.f.) is defined by

(10)    $Q_n(u) = E\{e^{uix}\} = \int e^{uix}\, dP_n(x)$.

In order to compute $Q_n$ we assume that the quadratic form (8) is transformed,
by a linear transformation, into a sum of squares. Using appropriate (in general
complex) coefficients $a_{\iota\kappa}$ one can write

(11)    $x = \dfrac{n}{2}\,(\eta_1^2 + \eta_2^2 + \cdots + \eta_k^2), \quad \eta_\iota = \sum_\kappa a_{\iota\kappa}\, \xi_\kappa$.

(The form $\psi_{\iota\kappa}$ is here supposed to be non-singular which, however, means no
loss of generality.) It will be seen later that explicit knowledge of the $a_{\iota\kappa}$
is not needed.

Now, for any real or complex $y$, the identity holds:

(12)    $e^{y^2/2} = (2\pi)^{-1/2} \int_{-\infty}^{\infty} \exp\left[-\tfrac12 t^2 + yt\right] dt$.

If we write $v$ for $\sqrt{ui}$ and replace in (12) successively $y$ by $v\sqrt{n}\,\eta_1, v\sqrt{n}\,\eta_2, \ldots$
we find

(13)    $e^{uix} = (2\pi)^{-k/2} \int\!\!\int \cdots \int \exp\left[-\tfrac12 \sum_\iota t_\iota^2 + v\sqrt{n} \sum_\kappa z_\kappa \xi_\kappa\right] dt_1\, dt_2 \cdots dt_k$

where

(14)    $z_\kappa = \sum_\iota a_{\iota\kappa}\, t_\iota \quad (\kappa = 1, 2, \ldots, k)$.
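The Gaussian identity (12), on which the linearization of the quadratic form rests, can be checked numerically for real $y$. The following is a minimal sketch using a plain trapezoidal rule; the integration window and step count are arbitrary assumptions, and the function name is a hypothetical helper.

```python
import math

def gaussian_identity_rhs(y: float, half_width: float = 12.0, steps: int = 4800) -> float:
    """Trapezoidal value of (2*pi)**(-1/2) * integral of exp(-t^2/2 + y*t) dt.

    The integrand is a Gaussian centred at t = y, so the window
    [y - half_width, y + half_width] captures it to machine precision.
    """
    lower = y - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        t = lower + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * t * t + y * t)
    return total * h / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    for y in (0.0, 0.5, -1.3, 2.0):
        print(y, gaussian_identity_rhs(y), math.exp(0.5 * y * y))
```

For complex $y$, as needed in the text, the identity follows by analytic continuation; the sketch only covers the real case.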

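Since the first exponential factor in the integrand is a constant with respect
to the chance variable, the expected value of $e^{uix}$ is given by

(15)    $Q_n(u) = E\{e^{uix}\} = (2\pi)^{-k/2} \int\!\!\int \cdots \int \exp\left[-\tfrac12 \sum_\iota t_\iota^2\right] G_n\, dt_1\, dt_2 \cdots dt_k$

with

(16)    $G_n = E\left\{\exp\left[v\sqrt{n} \sum_\kappa z_\kappa \xi_\kappa\right]\right\}$.

In order to find $G_n$ we consider the following $n$ collectives $C_1, C_2, \ldots, C_n$
with discontinuous, $(k+1)$-valued distributions: In $C_\nu$ the label values are
$z_1, z_2, \ldots, z_k$, and $z_{k+1}$, with $z_{k+1} = 0$, their probabilities $p_{\nu 1}, \ldots, p_{\nu,k+1}$.
The c.f. of this distribution at the point $-iv/\sqrt{n}$ is

(17)    $\sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, e^{v z_\kappa/\sqrt{n}}$.

If we multiply the $n$ expressions (17) for $\nu = 1, 2, \ldots, n$ the product will be,
according to well-known rules of probability calculus, the c.f. for the distribution
of the sum of the $n$ label components in the collective formed by combining
$C_1, C_2, \ldots, C_n$. The number of observed values in the interval $I_\kappa$ being
$n(\bar p_\kappa + \xi_\kappa)$, this sum is

        $n \sum_\kappa z_\kappa\, (\bar p_\kappa + \xi_\kappa)$

and therefore,

(18)    $E\left\{\exp\left[v\sqrt{n} \sum_\kappa z_\kappa (\bar p_\kappa + \xi_\kappa)\right]\right\} = \prod_{\nu=1}^{n} \sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, e^{v z_\kappa/\sqrt{n}}$.

Multiplying both sides of this equation by

(19)    $\exp\left[-v\sqrt{n} \sum_\kappa z_\kappa \bar p_\kappa\right] = \exp\left[-\dfrac{v}{\sqrt{n}} \sum_{\nu=1}^{n} \sum_\kappa p_{\nu\kappa}\, z_\kappa\right]$

and using the abbreviation

(20)    $\bar z_\nu = \sum_\kappa p_{\nu\kappa}\, z_\kappa$

we arrive at

(21)    $G_n = E\left\{\exp\left[v\sqrt{n} \sum_\kappa z_\kappa \xi_\kappa\right]\right\} = F_1 F_2 \cdots F_n$

with

(22)    $F_\nu = \sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, \exp\left[\dfrac{v}{\sqrt{n}}\,(z_\kappa - \bar z_\nu)\right]$.

This solves the problem: By inserting (21), (22) in (15) and carrying out the
integration with respect to $t_1, t_2, \ldots, t_k$ one has expressed $Q_n(u)$ in terms
of the given $p_{\nu\kappa}$ and of the coefficients $a_{\iota\kappa}$ which link the $z_\kappa$ to the $t_\iota$. This
expression for $Q_n(u)$ holds for all $n$.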

We have still to show that the integral (15) exists, at least for small $|u|$ or
$|v|$, independently of the value of $n$. For this purpose we develop $F_\nu$, as given
in (22), in the neighborhood of $v = 0$. At this point $F_\nu = 1$ and the first derivative
vanishes by virtue of (20). We thus have

,2 h+l 

(23) = 1 + Z Pv,(2k - 

2?i ,_i 

with $|\vartheta_\kappa| \le 1$. From the definition of $z_\kappa$ in (14) it follows that the ratio $|z_\kappa|/T$
with

(24)    $T^2 = t_1^2 + t_2^2 + \cdots + t_k^2$

has an upper bound depending on the $a_{\iota\kappa}$ only. On the other hand, according
to (20), $\bar z_\nu$ is a weighted mean of the $z_\kappa$ and, therefore, $|z_\kappa - \bar z_\nu|$ will not surpass
twice the maximum $|z_\kappa|$:

(25)    $|z_\kappa - \bar z_\nu| < aT$

where $a$ is a positive function of the coefficients $a_{\iota\kappa}$ which, in turn, are determined
by the $\psi_{\iota\kappa}$. Introducing (25) in (23) we find

I I < 1 + 1.^ I “ gl»“l»3r2/„ 

2n " 

and, finally, from (21):

(26)    $|G_n| < e^{a^2 |u| T^2}$.

Thus it is seen that for

(27)    $|u| < \dfrac{1}{2a^2}$ or $1 - 2a^2 |u| > 0$

the integral (15) admits the upper bound

(28)    $|Q_n(u)| < (2\pi)^{-k/2} \int\!\!\int \cdots \int e^{-\frac12 (1 - 2a^2 |u|) T^2}\, dt_1\, dt_2 \cdots dt_k = (1 - 2a^2 |u|)^{-k/2}$.

It also follows that the contribution to $Q_n(u)$ from the region $T > T_0$ tends to
zero with increasing $T_0$, uniformly with respect to $n$ and with respect to $u$ in
the region $|u| < 1/2a^2$.


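3. Asymptotic value of $Q_n(u)$. If the quantity $F_\nu$ introduced in (22) is considered
as a function of $z_1/\sqrt{n}, z_2/\sqrt{n}, \ldots, z_k/\sqrt{n}$, we may write

(29)    $F_\nu(z_1, z_2, \ldots, z_k) = \sum_{\kappa=1}^{k+1} p_{\nu\kappa}\, e^{v(z_\kappa - \bar z_\nu)}$.

Here, $\bar z_\nu$ is defined by (20) and, on the right-hand side, $z_{k+1}$ is zero. These functions
$F_\nu(z_1, z_2, \ldots, z_k)$ for $\nu = 1, 2, 3, \ldots$ have all the properties required
in Lemma C of Part I: At the point $z_1 = z_2 = \cdots = z_k = 0$ one has $F_\nu = 1$,
the first derivatives are

        $\dfrac{\partial F_\nu}{\partial z_\iota} = v p_{\nu\iota} - v p_{\nu\iota} \sum_{\kappa=1}^{k+1} p_{\nu\kappa} = 0$

and the second derivatives, $(\iota, \kappa \le k)$,

(30)    $\dfrac{\partial^2 F_\nu}{\partial z_\iota^2} = v^2 p_{\nu\iota}(1 - p_{\nu\iota}), \qquad \dfrac{\partial^2 F_\nu}{\partial z_\iota\, \partial z_\kappa} = -v^2 p_{\nu\iota}\, p_{\nu\kappa} \quad (\iota \ne \kappa)$.

The third derivatives are certainly bounded in any finite region of the $z$-space,
and this means also in any finite region of the $t$-space.

The matrix of the second derivatives except for the factor $v^2$ is exactly that
defined in eq. (19) of Part I:

(31)    $P^{(\nu)}_{\iota\kappa} = p_{\nu\iota}\, \delta_{\iota\kappa} - p_{\nu\iota}\, p_{\nu\kappa}$

and the arithmetical means of the derivatives form the matrix in eq. (23) of
Part I:

(31')   $P_{\iota\kappa} = \dfrac{1}{n} \sum_{\nu=1}^{n} p_{\nu\iota}\, \delta_{\iota\kappa} - \dfrac{1}{n} \sum_{\nu=1}^{n} p_{\nu\iota}\, p_{\nu\kappa}$.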

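Applying Lemma C we find

(32)    $G_n = F_1 F_2 \cdots F_n \sim \exp\left[\dfrac{v^2}{2} \sum_{\iota,\kappa} P_{\iota\kappa}\, z_\iota z_\kappa\right]$.

This is valid in any finite $t$-region. Since it has been shown at the end of the
foregoing section that, for small $|v|$, the outside contribution to the integral
(15) converges uniformly (for all $n$) towards zero, we are allowed to introduce
(32) in (15). Writing

(33)    $\sum_{\iota,\kappa} P_{\iota\kappa}\, z_\iota z_\kappa = \sum_{\iota,\kappa} \gamma_{\iota\kappa}\, t_\iota t_\kappa$ whereby $\gamma_{\iota\kappa} = \sum_{\lambda,\mu} P_{\lambda\mu}\, a_{\iota\lambda}\, a_{\kappa\mu}$

equation (15) becomes

(34)    $Q_n(u) \sim (2\pi)^{-k/2} \int\!\!\int \cdots \int \exp\left[-\tfrac12 \sum_{\iota,\kappa} (\delta_{\iota\kappa} - ui\,\gamma_{\iota\kappa})\, t_\iota t_\kappa\right] dt_1\, dt_2 \cdots dt_k$.

Now, it is well known that if $m_{\iota\kappa}$ is any positive definite matrix with the
determinant $|m_{\iota\kappa}|$, then

(35)    $(2\pi)^{-k/2} \int\!\!\int \cdots \int \exp\left[-\tfrac12 \sum_{\iota,\kappa} m_{\iota\kappa}\, t_\iota t_\kappa\right] dt_1\, dt_2 \cdots dt_k = |m_{\iota\kappa}|^{-1/2}$.

This is likewise true if the matrix $m_{\iota\kappa}$, which we also call $M$, has the form $M =
M_1 - \lambda M_2$ where $M_1$ is positive definite, $M_2$ arbitrary (complex) and $|\lambda|$ sufficiently
small. Thus, the integration formula (35) applies to (34) and the result
is reached, for small $|u|$:

(36)    $Q_n(u) \sim Q(u) = D^{-1/2}(ui), \quad D(\lambda) = |\delta_{\iota\kappa} - \lambda\gamma_{\iota\kappa}|$.

If the $a_{\iota\kappa}$ which transform the given quadric into a sum of squares are known,
(36) with (33) supply the solution of our problem.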

The formula (36) is susceptible of several useful transformations. Let us
write $A$ for the matrix $a_{\iota\kappa}$, $A'$ for the transposed matrix, and $\Psi$, $P$, $\Gamma$, $I$ respectively
for the matrices $\psi_{\iota\kappa}$, $P_{\iota\kappa}$, $\gamma_{\iota\kappa}$, $\delta_{\iota\kappa}$. Then, obviously

(37)    $\Psi = A'A, \quad \Gamma = APA', \quad M = I - ui\,\Gamma$.

If we multiply $M$ by $A'$ to the left and by $A$ to the right, we obtain

(38)    $A'MA = A'IA - ui\, A'APA'A = \Psi - ui\, \Psi P \Psi$.

In this operation the determinant of $M$ is multiplied by $|\Psi|$. Thus $D(\lambda)$
can be written as

(39)    $D(\lambda) = \dfrac{|\Psi - \lambda\, \Psi P \Psi|}{|\Psi|}$ with $(\Psi P \Psi)_{\iota\kappa} = \sum_{\lambda,\mu} \psi_{\iota\lambda}\, P_{\lambda\mu}\, \psi_{\mu\kappa}$.

Here, the knowledge of the $a_{\iota\kappa}$ is no longer required.

If the matrix (38) is multiplied twice by the inverse $\Psi^*$ of $\Psi$, we find $\Psi^* - ui\,P$
and, therefore,

(40)    $D(\lambda) = |\psi_{\iota\kappa}| \times |\psi^*_{\iota\kappa} - \lambda P_{\iota\kappa}|$.

As $P$ is positive definite and $\Psi^*$ real, it follows that all roots of $D(\lambda)$,
the “Eigenwerte” of $\Gamma$, are real numbers. Therefore, $D^{-1/2}(ui)$ is a regular
function along the real axis in the $u$-plane. Thus, (36) which was proved so
far for small $|u|$ only remains valid for all real values of $u$: The c.f. of the
asymptotic distribution is represented by $D^{-1/2}(ui)$ for all real $u$-values.
Multiplying (38) only once by $\Psi^*$ we obtain one of the two forms

(41)    $I - ui\,\Psi P$ or $I - ui\,P\Psi$

which lead to

(42)    $D(\lambda) = |\delta_{\iota\kappa} - \lambda s_{\iota\kappa}| = |\delta_{\iota\kappa} - \lambda s_{\kappa\iota}|, \quad s_{\iota\kappa} = \sum_\lambda \psi_{\iota\lambda}\, P_{\lambda\kappa}$.

Although this formula has been derived by means of $\Psi^*$, it can be seen by continuity
considerations that it remains valid whatever the (symmetric) matrix
$\Psi$ is. The formula makes it clear that the asymptotic distribution of the
quadric is completely determined by the “Eigenwerte” of the matrix
$S = \Psi P$. This bears out our second main theorem in Chapter II, as far as
quantics of the form (8) are concerned. It will be seen in sec. 5 how (42) applies
to the continuous case.

We, finally, apply to (36) a transformation that is valid only if $P$ has an inverse
matrix $P^*$. (As shown in Part I, sec. 4 this is not the case if the $k$ intervals to
which the subscripts $1, 2, \ldots, k$ refer cover the whole range of the variables
$x_1, x_2, \ldots, x_n$.) Multiplying (41) by $P^*$ we find the matrix $P^* - ui\,\Psi$ and
thus

(43)    $D(\lambda) = |P_{\iota\kappa}| \times |P^*_{\iota\kappa} - \lambda\psi_{\iota\kappa}|$.

This is equivalent to

(44)    $Q(u) = |P_{\iota\kappa}|^{-1/2}\, (2\pi)^{-k/2} \int\!\!\int \cdots \int \exp\left[-\tfrac12 \sum P^*_{\iota\kappa}\, \xi_\iota \xi_\kappa + \tfrac{ui}{2} \sum \psi_{\iota\kappa}\, \xi_\iota \xi_\kappa\right] d\xi_1\, d\xi_2 \cdots d\xi_k$.

According to the definition of the characteristic function eq. (44) can be interpreted
as stating that

(45)    $|P_{\iota\kappa}|^{-1/2}\, (2\pi)^{-k/2} \exp\left[-\tfrac12 \sum P^*_{\iota\kappa}\, \xi_\iota \xi_\kappa\right]$

is the asymptotic probability density for the simultaneous occurrence of the
$k$ variables of integration in (44), which stand for the excesses multiplied by $\sqrt{n}$.
The expression (45) can be arrived at by applying the Central
Limit Theorem to the case of $k$ independent chance variables. Since, however,
$P^*$ does not exist in general, eq. (44) would not be a suitable point of departure
for developing the theory that concerns us here.

4. Asymptotic value of $P_n(x)$, illustrations. The relationship between the
c.f. and the c.d.f. of a distribution is well known and need not be discussed here.
For the purposes of this section, two aspects of this relationship only are needed.
First, the theorem, first proved by G. Pólya [5], stating that if the
c.f. $Q_n(u)$ tend towards a limiting function $Q(u)$, the corresponding c.d.f. $P_n(x)$
tend towards the $P(x)$ that corresponds to $Q(u)$. Second, the additivity:
If $Q(u)$ is of the form $\alpha Q'(u) + \beta Q''(u)$ with $\alpha + \beta = 1$ then $P(x) =
\alpha P'(x) + \beta P''(x)$ with the $P'(x)$, $P''(x)$ corresponding to $Q'(u)$ and $Q''(u)$.

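a) Let us first consider a function of two excess values $\xi_1$, $\xi_2$ only

(46)    $x = \dfrac{n}{2}\, f = \dfrac{n}{2}\,(A\xi_1^2 + 2B\xi_1\xi_2 + C\xi_2^2)$,

that is, $\psi_{11} = A$, $\psi_{12} = \psi_{21} = B$, $\psi_{22} = C$. The product $\Psi P$ is

(47)    $\begin{pmatrix} AP_{11} + BP_{12} & AP_{12} + BP_{22} \\ BP_{11} + CP_{12} & BP_{12} + CP_{22} \end{pmatrix}$

and the determinant of $I - \lambda\Psi P$

(48)    $D(\lambda) = 1 - \lambda\,[AP_{11} + 2BP_{12} + CP_{22}] + \lambda^2 (AC - B^2)(P_{11}P_{22} - P_{12}^2)$.

The asymptotic probability density will be

(49)    $\dfrac{dP(x)}{dx} = \dfrac{1}{2\pi} \int_{-\infty}^{\infty} \dfrac{e^{-uix}}{\sqrt{D(ui)}}\, du$.

We are particularly interested in the case that $P$ is “complete,” i.e. a matrix
with all horizontal and vertical sums vanishing. Then $P_{11} = P_{22} = -P_{12} =
\overline{p_1 p_2}$, the arithmetical mean of the products $p_{\nu 1} p_{\nu 2}$, and
$D(\lambda) = 1 - \lambda (A - 2B + C)\, \overline{p_1 p_2}$. Here, instead of (49) we have

(50)    $\dfrac{dP(x)}{dx} = \dfrac{1}{2\pi} \int_{-\infty}^{\infty} \dfrac{e^{-uix}\, du}{\sqrt{1 - ui\,(A - 2B + C)\,\overline{p_1 p_2}}} = \dfrac{\exp\left[-x / \left((A - 2B + C)\,\overline{p_1 p_2}\right)\right]}{\sqrt{\pi\,(A - 2B + C)\,\overline{p_1 p_2}\; x}}$

for $x/(A - 2B + C) \ge 0$. This is with respect to $\sqrt{|x|}$ a Gauss distribution with the variance

        $|A - 2B + C|\; \overline{p_1 p_2}\,/\,2$.

If, in addition to the assumption that $P$ is “complete” (i.e. in the present case
that $p_{\nu 1} + p_{\nu 2} = 1$ for all $\nu$) the further assumption is made that the two intervals
$I_1$ and $I_2$ cover the whole range of the original chance variables $x_1, x_2,
x_3, \ldots$, one would have also $\xi_1 + \xi_2 = 0$ and from (46)

        $x = \dfrac{n}{2}\,(A - 2B + C)\,\xi_1^2$.

In this case, $\sqrt{|x|}$ is a linear statistical function and the Central Limit Theorem
leads to the same result as that expressed in (50). It is seen, however, from our
derivation, that (50) holds under wider conditions: If $p_{\nu 1} + p_{\nu 2} = 1$ for all $\nu$,
there may exist another interval $I_3$ within the range of the chance variables
$x_1, x_2, x_3, \ldots$ so that $\xi_1 + \xi_2$ is not necessarily zero.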

The latter remark suggests the following general theorem: If $f$ is a function
of the $k$ variables $\xi_1, \xi_2, \ldots, \xi_k$ and $g$ another such function but vanishing when
$\xi_1 + \xi_2 + \cdots + \xi_k = 0$, then $f$ and $f + g$ have the same asymptotic distribution
provided that for each $\nu$ the sum $p_{\nu 1} + p_{\nu 2} + \cdots + p_{\nu k} = 1$. In the case of
quadrics this result is equivalent to the following matrix theorem: If $P$, $\Psi$, $A$
are symmetric matrices, $P$ with all horizontal and vertical sums equal to zero,
$\Psi$ arbitrary, and $A$ of the form $a_{\iota\kappa} = a_\iota + a_\kappa$ then the two products

(51)    $P\Psi$ and $P(\Psi + A)$

have the same characteristic roots. This can be proved by the usual methods
of matrix calculus. The matrix $PA$ has all characteristic roots equal to zero.²
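The matrix theorem around (51) lends itself to a small exact check. The sketch below uses hypothetical sample matrices and relies on the standard fact that two $k \times k$ matrices have the same characteristic roots (with multiplicities) exactly when the traces of their first $k$ powers agree; it is an illustration, not a proof.

```python
from fractions import Fraction

def matmul(X, Y):
    """Exact product of two square matrices given as lists of rows."""
    k = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]

def power_traces(M):
    """Traces of M, M^2, ..., M^k; together they fix the characteristic roots."""
    k = len(M)
    traces, power = [], M
    for _ in range(k):
        traces.append(sum(power[i][i] for i in range(k)))
        power = matmul(power, M)
    return traces

# Hypothetical data: P has the form (52) for p = (1/2, 1/3, 1/6), so all its
# row and column sums vanish; Psi is an arbitrary symmetric matrix.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
k = len(p)
P = [[(p[i] if i == j else 0) - p[i] * p[j] for j in range(k)] for i in range(k)]
Psi = [[Fraction(v) for v in row] for row in ([2, -1, 4], [-1, 3, 0], [4, 0, 5])]
a = [Fraction(7), Fraction(-2), Fraction(3)]
A = [[a[i] + a[j] for j in range(k)] for i in range(k)]   # a_ik = a_i + a_k
Psi_plus_A = [[Psi[i][j] + A[i][j] for j in range(k)] for i in range(k)]

traces_plain = power_traces(matmul(P, Psi))
traces_shifted = power_traces(matmul(P, Psi_plus_A))
traces_PA = power_traces(matmul(P, A))

if __name__ == "__main__":
    print(traces_plain, traces_shifted, traces_PA)
```

With exact rational arithmetic the two trace sequences coincide and the traces of the powers of $PA$ are all zero, in line with the statement that $PA$ has only vanishing characteristic roots.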
b) In the definition of Karl Pearson's test function which is usually called
$\chi^2$, it is presumed that a sample is drawn from the combination of $n$ equal distributions.
In this case all $P^{(\nu)}$ are equal and coincide with the mean matrix, which then can
simply be written $P$:

(52)    $P_{\iota\kappa} = p_\iota\, \delta_{\iota\kappa} - p_\iota\, p_\kappa$.

The chance variable we now consider will be

(53)    $x = \dfrac{1}{2}\chi^2 = \dfrac{n}{2} \sum_{\iota=1}^{k} \dfrac{\xi_\iota^2}{p_\iota}$.

Thus $\psi_{\iota\kappa} = \delta_{\iota\kappa}/p_\iota$ and the elements of $P\Psi$ are

(53')   $(P\Psi)_{\iota\kappa} = \delta_{\iota\kappa} - p_\iota$.

The matrix $I - \lambda P\Psi$ has the elements

        $\delta_{\iota\kappa}(1 - \lambda) + \lambda p_\iota$.

If the $k$th column is subtracted from any one of the others, only two terms remain,
one equal to $1 - \lambda$ and one equal $-(1 - \lambda)$ in the last row. Thus, the
determinant $D(\lambda)$ includes $(k - 1)$ times the factor $(1 - \lambda)$. On the other hand
$D(\lambda)$ is of degree $(k - 1)$ and has the absolute term 1. Therefore

(54)    $D(\lambda) = (1 - \lambda)^{k-1}$.

This supplies the $\chi^2$-distribution with $(k - 1)$ “degrees of freedom”

(55)    $Q(u) = (1 - ui)^{-(k-1)/2}, \quad \dfrac{dP(x)}{dx} = \dfrac{x^{(k-3)/2}\, e^{-x}}{\Gamma\!\left(\frac{k-1}{2}\right)} \quad (x \ge 0)$.

² A proof of the matrix theorem has meanwhile been published by Alfred Brauer, Bull.
Amer. Math. Soc., Vol. 53 (1947), pp. 605-607.
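The result (54) can be illustrated with exact arithmetic. The sketch below, with hypothetical cell probabilities, forms $P\Psi$ from (52) and (53) and checks that it equals $\delta_{\iota\kappa} - p_\iota$, that it reproduces itself when squared, and that its trace is $k - 1$. A matrix with these properties has the characteristic root 1 with multiplicity $k - 1$ and the root 0 once, so that $|I - \lambda P\Psi| = (1 - \lambda)^{k-1}$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Exact product of two square matrices given as lists of rows."""
    k = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]

def pearson_product(p):
    """P * Psi with P from (52) and psi_ik = delta_ik / p_i as in (53)."""
    k = len(p)
    P = [[(p[i] if i == j else 0) - p[i] * p[j] for j in range(k)]
         for i in range(k)]
    Psi = [[(1 / p[i] if i == j else Fraction(0)) for j in range(k)]
           for i in range(k)]
    return matmul(P, Psi)

# Hypothetical probabilities of k = 3 intervals, summing to 1.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
k = len(p)
M = pearson_product(p)
expected = [[(1 if i == j else 0) - p[i] for j in range(k)] for i in range(k)]
is_projection = matmul(M, M) == M
trace = sum(M[i][i] for i in range(k))

if __name__ == "__main__":
    print(M == expected, is_projection, trace)
```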


Again, our result is slightly more general than that reached in the usual theory.
It includes the case that in addition to the $k$ intervals with the probabilities
$p_1, p_2, \ldots, p_k$ (whose sum is 1) there are other intervals with probability zero.

On the other hand, if to $\tfrac12\chi^2$ a term of the form $n\sum (a_\iota + a_\kappa)\,\xi_\iota \xi_\kappa$ is added, this
would not change the asymptotic distribution.

One may ask for other quadratic functions of $\xi_1, \xi_2, \ldots, \xi_k$, whose asymptotic
distribution is given by (55). In particular, one might be interested in a generalization
of $\chi^2$ for the case of unequal original distributions. The answer can easily
be given by introducing the cofactors of order $(k - 1)$ and of order $(k - 2)$ of the
determinant $|P_{\iota\kappa}|$. It was mentioned in sec. 4 of Part I that all cofactors of
order $(k - 1)$, in the case of “complete” $P$, have the same value. It may be
denoted by $\Delta$. The cofactor corresponding to the lines $\iota, \kappa$ and the columns
$\lambda, \mu$ will be denoted by $\Delta_{\iota\kappa,\lambda\mu}$ with $\Delta_{\iota\kappa,\lambda\mu} = 0$ if $\iota = \kappa$ or $\lambda = \mu$. Then, if $l$ is any one
of the integers $1, 2, \ldots, k$

(56)    $\psi_{\iota\kappa} = \dfrac{\Delta_{\iota l,\kappa l}}{\Delta}$

is one possible solution. In fact, the product $P\Psi$ has in this case the elements

(57)    $(P\Psi)_{\iota\kappa} = \delta_{\iota\kappa}$ for $\iota, \kappa \ne l$,
        $(P\Psi)_{\iota\kappa} = -1$ for $\iota = l$, $\kappa \ne l$,
        $(P\Psi)_{\iota\kappa} = 0$ for $\kappa = l$.

The determinant of $I - \lambda P\Psi$ is then seen to equal $(1 - \lambda)^{k-1}$.
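The construction behind (56) and (57) can be tried on a small case. The sketch below rests on an assumption about how (56) is to be read: the coefficients $\psi_{\iota\kappa}$ are taken as the inverse of $P$ with line and column $l$ deleted, padded with zeros in line and column $l$, which is what the ratio of cofactors amounts to. The unequal distributions are hypothetical sample data and the helper names are not from the original.

```python
from fractions import Fraction

def matmul(X, Y):
    """Exact product of two square matrices given as lists of rows."""
    k = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]

def inverse(M):
    """Exact Gauss-Jordan inverse of a nonsingular matrix of Fractions."""
    n = len(M)
    aug = [list(M[i]) + [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mean_P(prob_rows):
    """Mean matrix (31'): average of p_vi*delta_ik - p_vi*p_vk over all v."""
    n, k = len(prob_rows), len(prob_rows[0])
    return [[sum((p[i] if i == j else 0) - p[i] * p[j] for p in prob_rows) / n
             for j in range(k)] for i in range(k)]

def psi_unsymmetrical(P, l):
    """Inverse of P without line/column l, with zeros in line and column l."""
    k = len(P)
    keep = [i for i in range(k) if i != l]
    reduced_inverse = inverse([[P[i][j] for j in keep] for i in keep])
    Psi = [[Fraction(0)] * k for _ in range(k)]
    for a, i in enumerate(keep):
        for b, j in enumerate(keep):
            Psi[i][j] = reduced_inverse[a][b]
    return Psi

# Hypothetical unequal distributions over k = 3 intervals; each row sums to 1.
rows = [[Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)],
        [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
        [Fraction(1, 5), Fraction(3, 5), Fraction(1, 5)]]
P = mean_P(rows)
k, l = len(P), 2
product = matmul(P, psi_unsymmetrical(P, l))
pattern_57 = [[0 if j == l else (-1 if i == l else int(i == j)) for j in range(k)]
              for i in range(k)]

if __name__ == "__main__":
    print(product == pattern_57)
```

The product reproduces the pattern (57); since that matrix equals its own square and has trace $k - 1$, the determinant of $I - \lambda P\Psi$ is $(1 - \lambda)^{k-1}$.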

The solution (56), however, is unsymmetrical in the sense that it does not
include any terms with $\xi_l$. A completely symmetrical solution in which all
$\xi$ play the same role is given by

(58)    $\psi_{\iota\kappa} = \dfrac{1}{k} \sum_{l=1}^{k} \dfrac{\Delta_{\iota l,\kappa l}}{\Delta}$.

According to (57) the matrix $P\Psi$ now consists of terms $(k - 1)/k$ in the principal
diagonal and $-1/k$ at all other places, that is

(58')   $(P\Psi)_{\iota\kappa} = \delta_{\iota\kappa} - \dfrac{1}{k}$.

In the same way as in the case of (53') it can be seen that the determinant of
$I - \lambda P\Psi$ equals here $(1 - \lambda)^{k-1}$. The asymptotic distribution of $\tfrac{n}{2} \sum \psi_{\iota\kappa}\, \xi_\iota \xi_\kappa$ with
the coefficients (58) is, therefore, the $\chi^2$-distribution with $(k - 1)$ degrees of freedom.

If the formula (58) is applied to the case of equal $P^{(\nu)}$ the corresponding
quadric becomes





that is, $\chi^2$ + a term vanishing with $\xi_1 + \xi_2 + \cdots + \xi_k$. One can easily modify
(58) so that it leads to $\chi^2$ without any addition.

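c) A third group of examples where the asymptotic density is expressed by
simple functions is that where $D(\lambda)$ is an exact square, that is, all characteristic
roots (except the one that is zero) have even multiplicities. Let us assume $k =
2m + 1$ and let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be $m$ double roots. Then

(59)    $Q(u) = D^{-1/2}(ui) = \prod_{\mu=1}^{m} \left(1 - \dfrac{ui}{\lambda_\mu}\right)^{-1} = \sum_{\mu=1}^{m} \dfrac{A_\mu}{1 - ui/\lambda_\mu}$

with

(59')   $A_\mu = \prod_{\rho \ne \mu} \left(1 - \dfrac{\lambda_\mu}{\lambda_\rho}\right)^{-1}$

and therefore

(60)    $\dfrac{dP(x)}{dx} = \sum_{\mu=1}^{m} A_\mu \lambda_\mu\, e^{-\lambda_\mu x}, \quad x \ge 0$.

Assume, for instance, that all original distributions are uniform, that is

(61)    $P^{(\nu)}_{\iota\kappa} = P_{\iota\kappa} = \dfrac{1}{k}\, \delta_{\iota\kappa} - \dfrac{1}{k^2}$

and that the quadric $f$ is given in the form (11) with the following $a_{\iota\kappa}$:

(62)    $a_{\iota\kappa} = \sqrt{k c_1}$ for $\iota = 1$,
        $a_{\iota\kappa} = \sqrt{k c_\iota}$ for $\iota > 1$, $\kappa = 1, 2, \ldots, \iota - 1$,
        $a_{\iota\kappa} = -(\iota - 1)\sqrt{k c_\iota}$ for $\iota > 1$, $\kappa = \iota$,
        $a_{\iota\kappa} = 0$ for $\iota > 1$, $\kappa = \iota + 1, \iota + 2, \ldots, k$.

Then, the $\gamma_{\iota\kappa}$ as defined in (33) become

        $\gamma_{\iota\kappa} = c_\iota\, \iota(\iota - 1)\, \delta_{\iota\kappa}$ for $\iota$ or $\kappa > 1$,
        $\gamma_{\iota\kappa} = 0$ for $\iota = \kappa = 1$

and $D(\lambda)$ according to (36) takes the form

(63)    $D(\lambda) = |\delta_{\iota\kappa} - \lambda\gamma_{\iota\kappa}| = \prod_{\iota=2}^{k} [1 - \lambda c_\iota\, \iota(\iota - 1)]$.

In other terms, for the quadric

        $f = kc_1(\xi_1 + \cdots + \xi_k)^2 + kc_2(\xi_1 - \xi_2)^2 + kc_3(\xi_1 + \xi_2 - 2\xi_3)^2 + \cdots + kc_k[\xi_1 + \xi_2 + \cdots + \xi_{k-1} - (k - 1)\xi_k]^2$

the characteristic $\lambda$-values are $1/c_\iota\, \iota(\iota - 1)$.

Now, to obtain the case of $m$ double roots with $k = 2m + 1$ we have simply
to choose

        $c_2 = 3c_3, \quad 3c_4 = 5c_5, \quad 5c_6 = 7c_7, \ldots$

The first term on the right-hand side can be entirely omitted in accordance to
what was said in connection with (51). Besides, for the same reason, the expression
can be simplified in various ways by assuming $\xi_1 + \xi_2 + \cdots + \xi_k = 0$.
As a numerical example, take $k = 5$, $c_2 = 3$, $c_3 = 1$, $c_4 = 5$, $c_5 = 3$. Then

        $f = 20(\xi_1^2 + \xi_2^2 + \xi_3^2 + 20\xi_4^2 + 20\xi_5^2 - \xi_1\xi_2 - \xi_2\xi_3 - \xi_3\xi_1 + 10\xi_4\xi_5)$

leads to the characteristic values $\lambda = 1/6$ and $1/60$ and the asymptotic density
becomes

        $\dfrac{dP}{dx} = \dfrac{1}{54}\left(e^{-x/60} - e^{-x/6}\right)$.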
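The numerical example with $k = 5$ can be verified exactly. The following sketch builds the matrix $\Psi$ of the simplified quadric and the uniform matrix (61), and computes the characteristic polynomial of $\Psi P$ by the Faddeev-LeVerrier recursion, a standard method chosen here for convenience and not taken from the original. The roots of $\det(\mu I - \Psi P)$ are the reciprocals of the $\lambda$-values in $D(\lambda) = |I - \lambda\Psi P|$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Exact product of two square matrices given as lists of rows."""
    k = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]

def char_poly(M):
    """Coefficients c[0..k] of det(mu*I - M) = sum of c[j] * mu**j."""
    k = len(M)
    c = [Fraction(0)] * (k + 1)
    c[k] = Fraction(1)
    N = [[Fraction(0)] * k for _ in range(k)]
    for step in range(1, k + 1):
        N = matmul(M, N)
        for i in range(k):
            N[i][i] += c[k - step + 1]
        MN = matmul(M, N)
        c[k - step] = -sum(MN[i][i] for i in range(k)) / step
    return c

def poly_mul(a, b):
    """Product of two polynomials given by coefficient lists (low order first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

k = 5
# Matrix of f = 20(x1^2 + x2^2 + x3^2 + 20 x4^2 + 20 x5^2 - x1x2 - x2x3 - x3x1 + 10 x4x5);
# an off-diagonal entry is half the coefficient of the mixed product.
Psi = [[Fraction(v) for v in row] for row in (
    [20, -10, -10, 0, 0],
    [-10, 20, -10, 0, 0],
    [-10, -10, 20, 0, 0],
    [0, 0, 0, 400, 100],
    [0, 0, 0, 100, 400])]
P = [[Fraction(int(i == j), k) - Fraction(1, k * k) for j in range(k)]
     for i in range(k)]

computed = char_poly(matmul(Psi, P))
expected = [Fraction(0), Fraction(1)]            # the simple root 0
for root in (6, 6, 60, 60):                      # double roots 6 and 60
    expected = poly_mul(expected, [Fraction(-root), Fraction(1)])

if __name__ == "__main__":
    print(computed == expected)
```

The characteristic roots of $\Psi P$ come out as 0, 6, 6, 60, 60, so that $D(\lambda) = (1 - 6\lambda)^2 (1 - 60\lambda)^2$ with the double roots $\lambda = 1/6$ and $1/60$; inserting these in (59') and (60) gives the coefficient $1/54$ of the density.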

In a similar way other groups of quadrics with asymptotic distributions of
the type (60) can easily be constructed. One may, for instance, use eq. (41)
and make vanish, in the matrix $S = \Psi P$, all elements on one side of the diagonal
so that the roots are immediately known.


5. Transition to the continuous case. In this concluding section, the transition
to the case of a quadric of the form (1) with continuous $\psi(x, y)$ will be
outlined. The formula best fit for this purpose is eq. (36). We therefore
suppose the statistical function $f$ given as

(64)    $f = \int\!\!\int \psi(x, y)\, dT_n(x)\, dT_n(y)$ with $\psi(x, y) = \int \alpha(r, x)\, \alpha(r, y)\, dr$.

In analogy to (33) we derive

(65)    $\gamma(x, y) = \int\!\!\int \alpha(x, s)\, \alpha(y, t)\, dU_n(s, t) = \int \alpha(x, s)\, \alpha(y, s)\, d\bar V_n(s) - \int\!\!\int \alpha(x, s)\, \alpha(y, t)\, dW_n(s, t)$.

Since $dW$ is symmetric, this function $\gamma(x, y)$ is symmetric with respect to $x$ and
$y$. If $D(\lambda)$ denotes the Fredholm determinant of the “kernel” $\gamma(x, y)$, we conclude
from (36) that the characteristic function of the asymptotic distribution
of $f$ will be given by

(66)    $Q_n(u) \sim \dfrac{1}{\sqrt{D(ui)}}$

if certain convergence conditions are satisfied.

In order to establish (66) the main point is to find a sequence of functions
$\psi_1(x, y), \psi_2(x, y), \ldots$ each of the type considered in the foregoing sections and
such that 1) the distribution of the quadric $f_k$ with the coefficients $\psi_k$ tends towards
the distribution of $f$ with increasing $k$ and independently of $n$, and 2) that
the determinants $D_k$ corresponding to $\psi_k$ converge towards $D$ as $k$ increases indefinitely.
Using our Lemma A we can replace the first condition by asking
that the expectation of $|f - f_k|$ should go to zero with $k \to \infty$ independently of $n$.
The following assumptions shall be made concerning $f$ and the $V_\nu(x)$. The
function $\alpha(r, x)$ in (64) is continuous and bounded in every finite region, there
exist two positive continuous functions $\alpha(r)$, $\beta(x)$ such that

(67)    $|\alpha(r, x)| \le \alpha(r)\,\beta(x)$

and that the integrals

(68)    $\int \alpha^2(r)\, dr = M, \quad \int \beta(x)\, dV_\nu(x), \quad \int \beta^2(x)\, dV_\nu(x)$

exist, the latter two being bounded and converging uniformly with respect to
$\nu$. We are going to devise a step function $\psi_k(x, y)$ so that for the corresponding
$f_k$ and any positive $\epsilon_1$

(69)    $E\{|f - f_k|\} \le \epsilon_1$.

Let $N$ be an upper bound of the integrals

(70)    $\int \beta(x)\, dV_\nu(x) \le N, \quad \int \beta(x)\, d\bar V_n(x) \le N$

and $\epsilon = \epsilon_1/(5 + 8N)$. Choose a value $L$ such that

(71)    $\int_{|x|>L} \beta(x)\, dV_\nu(x) \le \dfrac{\epsilon}{M}, \quad \int_{|x|>L} \beta^2(x)\, dV_\nu(x) \le \dfrac{\epsilon}{M}$

and, calling $B$ the maximum of $\beta(x)$ in $|x| \le L$, another quantity $R$ such that

(72)    $\int_{|r|>R} \alpha^2(r)\, dr \le \dfrac{\epsilon}{2B^2}$.

We subdivide, in the $x$-$y$-$r$-space, the domain $|x| \le L$, $|y| \le L$, $|r| \le R$ in
equal cells where $k$ is determined by the condition that the absolute value
of the variation of $\alpha(r, x)\,\alpha(r, y)$ within each cell does not exceed $\epsilon/4R$. Outside
this domain we set $\alpha_k(r, x) = 0$ while inside the domain $\alpha_k(r, x)\,\alpha_k(r, y)$ shall
equal the value that $\alpha(r, x)\,\alpha(r, y)$ assumes in the center of the respective cell.
Then $\psi_k(x, y)$ will be defined by

(73)    $\psi_k(x, y) = \int \alpha_k(r, x)\, \alpha_k(r, y)\, dr$.

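From the definition of $k$ and from (67) and (72) it follows that

(74)    $|\psi(x, y) - \psi_k(x, y)| \le \int_{|r| \le R} |\alpha(r, x)\,\alpha(r, y) - \alpha_k(r, x)\,\alpha_k(r, y)|\, dr + \int_{|r|>R} |\alpha(r, x)\,\alpha(r, y)|\, dr$
        $\le \dfrac{\epsilon}{2} + \beta(x)\,\beta(y) \int_{|r|>R} \alpha^2(r)\, dr \le \dfrac{\epsilon}{2} + \dfrac{\epsilon}{2} = \epsilon$

as long as $|x| \le L$, $|y| \le L$. If this square is called $(L)$ and the complementary
region $(\bar L)$ we have

(75)    $f - f_k = \int\!\!\int_{(L)} [\psi(x, y) - \psi_k(x, y)]\, dT_n(x)\, dT_n(y) + \int\!\!\int_{(\bar L)} \psi(x, y)\, dT_n(x)\, dT_n(y)$

and since the integral of $|dT_n(x)\, dT_n(y)|$ is not larger than 4, while, according
to (64) and (67)

(76)    $|\psi(x, y)| \le \beta(x)\,\beta(y) \int \alpha^2(r)\, dr = M \beta(x)\,\beta(y)$

we conclude from (74) and (75)

(77)    $|f - f_k| \le 4\epsilon + M \int\!\!\int_{(\bar L)} \beta(x)\,\beta(y)\, |dT_n(x)\, dT_n(y)|$.

This gives

(78)    $E\{|f - f_k|\} \le 4\epsilon + M \int\!\!\int_{(\bar L)} \beta(x)\,\beta(y)\, E\{|dT_n(x)\, dT_n(y)|\}$.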
J J(i) 

Now, from $|dT_n| = |dS_n - d\bar V_n| \le dT_n + 2\,d\bar V_n$ and from the formulas
derived in Part II,

        $E\{dT_n(x)\} = 0, \quad E\{dT_n(x)\, dT_n(y)\} = \dfrac{1}{n}\, dU_n(x, y)$

it follows

(79)    $E\{|dT_n(x)\, dT_n(y)|\} \le \dfrac{1}{n}\, dU_n(x, y) + 4\, d\bar V_n(x)\, d\bar V_n(y)$

with

(79')   $dU_n(x, y) = \delta(x, y)\, d\bar V_n(x) - dW_n(x, y) \le \delta(x, y)\, d\bar V_n(x)$.

If this is introduced in (78) and (71) taken into account, we find

(80)    $E\{|f - f_k|\} \le 4\epsilon + \dfrac{M}{n} \int_{|x|>L} \beta^2(x)\, d\bar V_n(x) + 4M \int\!\!\int_{(\bar L)} \beta(x)\,\beta(y)\, d\bar V_n(x)\, d\bar V_n(y)$
        $\le 4\epsilon + \dfrac{\epsilon}{n} + 4 \times 2N\epsilon \le (5 + 8N)\,\epsilon = \epsilon_1$

as required in (69).

On the other hand, it can be seen that the kernel $\gamma(x, y)$ as defined in (65)
is the limit of the sequence $\gamma_k(x, y)$

(81)    $\gamma_k(x, y) = \int\!\!\int \alpha_k(x, s)\, \alpha_k(y, t)\, dU_n(s, t)$ for $x, y$ in $(R)$,
        $\gamma_k(x, y) = 0$ for $x, y$ in $(\bar R)$

where $(R)$ means the region $|x| \le R$, $|y| \le R$ and $(\bar R)$ the complementary
region. In fact, from the definition of $k$ and eqs. (67) and (71) one has for $x, y$
in $(R)$:

(82)    $|\gamma(x, y) - \gamma_k(x, y)| \le \dfrac{\epsilon}{4R} \int\!\!\int_{(L)} |dU_n(s, t)| + \int\!\!\int_{(\bar L)} |\alpha(x, s)\, \alpha(y, t)\, dU_n(s, t)|$
        $\le \dfrac{\epsilon}{2R} + \alpha(x)\,\alpha(y) \left[\int_{|s|>L} \beta^2(s)\, d\bar V_n(s) + \dfrac{1}{n} \sum_{\nu} \int\!\!\int_{(\bar L)} \beta(s)\,\beta(t)\, dV_\nu(s)\, dV_\nu(t)\right]$
        $\le \dfrac{\epsilon}{2R} + \alpha(x)\,\alpha(y)\, \dfrac{\epsilon}{M}\,(1 + 2N)$.

Since $\alpha(x)$ is bounded, the right-hand side goes to zero with $\epsilon$. Finally, for
$x, y$ in $(\bar R)$ we have

(83)    $|\gamma(x, y) - \gamma_k(x, y)| \le \int\!\!\int |\alpha(x, s)\, \alpha(y, t)\, dU_n(s, t)|$
        $\le \alpha(x)\,\alpha(y) \left[\int \beta^2(s)\, d\bar V_n(s) + \dfrac{1}{n} \sum_{\nu} \int\!\!\int \beta(s)\,\beta(t)\, dV_\nu(s)\, dV_\nu(t)\right]$.

Here, the two terms in the brackets are bounded, but $\alpha(x)\,\alpha(y)$ goes to zero as $R$
increases. The conclusion is that $\gamma_k(x, y)$ tends uniformly towards $\gamma(x, y)$
with $k \to \infty$.

Thus, eq. (66) is established provided that the function $\gamma(x, y)$ defined in (65)
has a Fredholm determinant $D(\lambda)$ that is the limit of the corresponding algebraic
determinants and provided that the c.f. $D^{-1/2}(ui)$ leads to a c.d.f. with
bounded derivative.

As an example let us consider the case

(84)    $\alpha(r, x) = \sqrt{g'(r)}$ for $r \ge x$,
        $\alpha(r, x) = 0$ for $r < x$.

This function is not continuous as it was assumed in establishing (66). However,
the existence of a single discontinuity line, $x = r$, does not invalidate the
argument. We assume $g(\infty) = 0$ and $g'(r)$ equal to $dg/dr$. Then, in the case of
(84):

(85)    $\psi(x, y) = \int \alpha(r, x)\, \alpha(r, y)\, dr = -g(y)$ for $x \le y$,
        $\psi(x, y) = -g(x)$ for $x \ge y$.

Since, however, adding to $\psi$ a function of $x$ or of $y$ alone does not change the
value of $f$, we can also use

(85')   $\psi(x, y) = g(x)$ for $x \le y$,
        $\psi(x, y) = g(y)$ for $x \ge y$.

The statistical function $f$ that corresponds to (84) can be computed either from
(85) or (85'), or directly from (84) if we use the formula that follows from (64)

(86)    $f = \int \left[\int \alpha(r, x)\, dT_n(x)\right]^2 dr$.

The integral in the brackets is, in our case, seen to equal $\sqrt{g'(r)}\; T_n(r)$, thus

(86')   $f = \int g'(r)\, [S_n(r) - \bar V_n(r)]^2\, dr$.

This is exactly the test function mentioned in the Introduction, eq. (3).

To find the distribution of $I$ we have to compute $\gamma(x, y)$. Its definition (65)
can be written in the form

$$\gamma(x, y) = \frac{1}{n} \sum_{\nu} \left[ \int \alpha(x, s)\,\alpha(y, s)\, dV_\nu(s) - \int \alpha(x, s)\, dV_\nu(s) \int \alpha(y, s)\, dV_\nu(s) \right]. \tag{87}$$

This supplies in the case of (84)

$$\gamma(x, y) = \begin{cases} \sqrt{g'(x)g'(y)}\,\bigl[\bar V_n(x) - \overline{V_\nu(x)V_\nu(y)}\bigr] & \text{for } x \le y \\ \sqrt{g'(x)g'(y)}\,\bigl[\bar V_n(y) - \overline{V_\nu(x)V_\nu(y)}\bigr] & \text{for } x \ge y. \end{cases} \tag{88}$$

Here, the second term in the brackets is the arithmetical mean of the products
$V_\nu(x)V_\nu(y)$.

If the distributions $V_\nu(x)$ are all equal (independent of $\nu$) we have simply to
write $V(x)$ instead of $\bar V_n(x)$ and $V(x)V(y)$ instead of $\overline{V_\nu(x)V_\nu(y)}$.
If, in addition, the distributions in the original collectives are uniform in the
basic interval 0 to 1, one has

$$\gamma(x, y) = \begin{cases} \sqrt{g'(x)g'(y)}\; x(1 - y) & \text{for } 0 \le x \le y \le 1 \\ \sqrt{g'(x)g'(y)}\; y(1 - x) & \text{for } 0 \le y \le x \le 1. \end{cases} \tag{89}$$

This is the case dealt with in Smirnoff's papers [7, 8]. If, finally, $g'(x)$ is
supposed to be equal to 1 in the interval 0, 1, we arrive at a kernel $\gamma(x, y)$ whose
Fredholm determinant is well known:


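$$\gamma(x, y) = \begin{cases} x(1 - y) & \text{for } x \le y \\ y(1 - x) & \text{for } x \ge y, \end{cases} \qquad D(\lambda) = \frac{\sin \sqrt{\lambda}}{\sqrt{\lambda}}. \tag{90}$$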


This supplies immediately the c.f. and (in form of a definite integral) the c.d.f.
of the asymptotic distribution of $\omega^2$ for $g' = 1$.
The same result can be reached without the use of $\alpha(r, x)$ if we apply one of
the transformations discussed in the foregoing Section. Take, for instance,
instead of $\gamma(x, y)$ the unsymmetric kernel $\sigma(x, y)$ corresponding to the matrix
$S$ defined in (41). If all original distributions are equal, the element of
$S$ can be written as

(91) Su — P Pli'Plit ~ ^ 

I /* 

Calling $v(x)$ the density $dV(x)/dx$ in the continuous case, the corresponding
kernel becomes

$$\sigma(x, y) = v(x) \left[ \psi(x, y) - \int \psi(s, y)\, v(s)\, ds \right]. \tag{92}$$

With the $\psi$-values from (85'), $g' = 1$, $v = 1$, this gives

$$\sigma(x, y) = \begin{cases} x - y + \tfrac{1}{2} y^2 & \text{for } x \le y \\ \tfrac{1}{2} y^2 & \text{for } x \ge y. \end{cases} \tag{92'}$$

It can easily be seen that the "Eigenfunctions" of this $\sigma(x, y)$ are $\sin(\sqrt{\lambda_m}\, x)$
with $\lambda_m = m^2\pi^2$, and, therefore, the Fredholm determinant is that indicated in
(90).
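The link between the eigenvalues $\lambda_m = m^2\pi^2$ and the determinant in (90) can be checked numerically. The sketch below is only an illustration, not part of the original argument: it assumes the determinant is the infinite product of the factors $1 - \lambda/\lambda_m$, truncates that product, and compares it with $\sin\sqrt{\lambda}/\sqrt{\lambda}$. The function names are hypothetical.

```python
import math


def fredholm_det_product(lam, terms=20000):
    """Truncated product of (1 - lam / (m*pi)^2), m = 1..terms (assumed form of D)."""
    prod = 1.0
    for m in range(1, terms + 1):
        prod *= 1.0 - lam / (m * math.pi) ** 2
    return prod


def fredholm_det_closed(lam):
    """Closed form sin(sqrt(lam)) / sqrt(lam) quoted in (90), for lam > 0."""
    root = math.sqrt(lam)
    return math.sin(root) / root


if __name__ == "__main__":
    for lam in (1.0, 5.0, 20.0):
        print(lam, fredholm_det_product(lam), fredholm_det_closed(lam))
```

The truncated product converges slowly (the neglected tail is of order $\lambda/(\pi^2 M)$ for $M$ terms), so agreement to four or five decimals is about what this check can show.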

It might be added that the expectation and the asymptotic variance of $\omega^2$
can be computed, independently of the distribution, from the formulas
developed in Part I. The results are


(93) 


nE{o,^j = f g’(x)Y„(x)ll - ^(x)] 


dx 
and, in the case of all $V_\nu(x)$ equal,

$$n^2\, \mathrm{Var}\{\omega^2\} \sim 4 \iint_{x \le y} g'(x)\, g'(y)\, V^2(x)\, [1 - V(y)]^2\, dx\, dy. \tag{94}$$

These formulas have already been given in [4].

Another, more general, remark is this. If all $V_\nu(x)$ are equal, one can reduce
the problem, by a transformation of the original chance variable $x$ into
$x' = V(x)$, to the case of a uniform distribution over the interval 0 to 1. If the
$V_\nu(x)$ are not equal, it might still be possible to find a transformation
$x' = x'(x)$ such that all original distributions extend over a finite region on
the $x'$-axis only. In this case the restrictions concerning the behavior of the
distributions at infinity drop out.
APPROXIMATE SOLUTIONS FOR MEANS AND VARIANCES IN A
CERTAIN CLASS OF BOX PROBLEMS 

By Philip J. McCarthy 
Social Science Research Council 

1. Summary. Consider $n$ boxes, each box having an associated probability,
$p_i$ $(\sum p_i = 1)$, and an associated integer, $k_i$. If balls are thrown one by one
into these boxes, the probability being $p_i$ that any one ball falls into the $i$th box,
then the number of balls which must be thrown in order to obtain, for the first
time, at least $k_{i_1}$ balls in the $i_1$th box, at least $k_{i_2}$ balls in the $i_2$th box, $\ldots$, and at
least $k_{i_s}$ balls in the $i_s$th box, is a random variable, $N_s[k_1(p_1), k_2(p_2), \ldots, k_n(p_n)]$.
Here $i_1, i_2, \ldots, i_s$ represent the numbers of that set of $s$ boxes, $(1 \le s \le n)$,
which first satisfies the stated condition.

The distribution of $N_s[k_1(p_1), k_2(p_2), \ldots, k_n(p_n)]$ can be written down for any
set of values assigned to $n$, $s$, the $p_i$'s and the $k_i$'s. However, for $n$ greater than
2 the distribution assumes such an extremely complicated multinomial form 
that except for certain special cases even the mean of the distribution cannot 
be numerically evaluated without a prohibitive amount of labor. 

This paper presents the exact moments of $N_1[k_1(p_1), k_2(p_2)]$ and $N_2[k_1(p_1), k_2(p_2)]$
in forms that readily lend themselves to computation and shows how these
moments can be used to obtain approximate values for the mean and variance
for certain situations where $n$ is greater than two. These approximation
formulae are given for

1. The mean and variance, for any $n$ and any set of $k_i$'s and $p_i$'s when $s = 1$
or $n$,

2. The mean, for any $n$ and $2 \le s \le n - 1$, when $p_i = 1/n$, $k_i = k$,
$(i = 1, 2, \ldots, n)$.

Some indications are given concerning the error of the approximations, and the
circumstances which lead to a minimum (and maximum) error. Curves have
been prepared to show the mean for the two box case, the primary function
of these curves being to assist in the application of the approximation formulae.
Some problems where the results of this paper might be applicable are suggested
in the Introduction.

2. Introduction. A box problem is defined when one is given a fixed number 
of boxes, a collection of balls (either finite or infinite) , a set of rules governing 
the throwing of the balls into the boxes and a statement of the conditions which 
will bring the throwing to an end. The terminating conditions usually state 
either that a fixed number of balls will be thrown or that balls will be thrown 
until a particular distribution of balls in the boxes has been obtained. In the 
first of these, interest is centered on the possible distributions which can be
obtained, while in the latter the number of balls necessary to obtain a specified
distribution is of primary interest.

This paper will be concerned with certain problems falling in the latter cate- 
gory. In the simplest case one is given two boxes with associated probabilities 
$p_1$ and $p_2$, and associated integers $k_1$ and $k_2$. Balls are thrown one by one into
the two boxes, the probability being $p_1$ that any one ball goes in the first box and
$p_2$ that it goes in the second box. This process is stopped when either $k_1$ balls
fall in box 1 or $k_2$ balls in box 2, whichever occurs first. One is interested in the
distribution of the number of balls necessary to terminate the throwing. This 
problem was stated in essentially this form by Laplace [4], but he contented 
himself with merely writing down the probability generating function. 

Here the special case of two boxes will be treated in detail and the results
will then be generalized to the $n$-box case. In all of these instances it is
possible to write down exact expressions for the mean and variance of the number of
balls required to achieve the stated distribution. However, in almost every
case the resulting expressions are too complicated to be of any use when a
numerical answer is desired. The principal portion of this paper will be devoted to
obtaining approximate formulae from which numerical answers can be obtained 
for these problems. Some evaluation of the degree of approximation will be 
given in section 5, while curves to facilitate the computation will be given in 
section 6. 

The statement of these problems in terms of boxes and balls may lead one to 
the belief that they have no other interpretation. Actually this is not the case, 
and a few illustrations of this point will now be given. For example, consider
the curtailed single sampling plan used in acceptance sampling. A buyer re-
ceives a lot of articles. This lot will contain a certain proportion of defective
items. The buyer wishes to determine on the basis of sampling whether to 
accept or reject the lot. His knowledge of his own situation will allow him to 
specify the largest proportion of defectives which he is ordinarily willing to 
accept and the risk he is willing to take of accepting a lot with a proportion de- 
fective larger than this critical proportion. On the basis of these two values it 
is possible to set up a sampling plan in which the buyer will take a sample of size
$n$ out of the lot, inspect it, and reject it if there are $k_1$ or more defectives in the
sample. Of course once he has obtained $k_1$ defectives there is no need to inspect
the remainder of the sample. The lot will then be automatically rejected.
Similarly, once he has obtained $n - k_1$ non-defectives, he can accept the lot
without inspecting the remainder of the items. The average number of items which
he must inspect in order to reach a decision is given by the solution to the two
box problem stated above. Box 1 will receive the defective items, the
associated integer being $k_1$ and the associated probability being $p_1$, the true
proportion of defectives in the lot. Box 2 will receive the non-defective items, the
associated integer being $n - k_1$ and the associated probability being $p_2$, the true
proportion of non-defectives in the lot.

Laplace [4] considered problems of this type as applied to games of chance.
Thus suppose there are two players A and B who participate in successive trials
of a given event, the probability being $p_1$ that A wins on any one trial and $p_2$
that B wins. Then one can associate the integer $k_1$ with A and $k_2$ with B by
saying that A wins the match if he wins $k_1$ trials before B wins $k_2$ trials and
conversely. The analysis is exactly the same as for the two box problem. It is
apparent that this same situation can be extended to any number of players.

Another possible interpretation is as a particular kind of random walk
problem. Let a particle start at the origin of a system of rectangular coordinates
and suffer successive positive unit displacements, the probability being $p_1$ that
it moves one unit in the $x$-direction and $p_2$ that it moves one unit in the
$y$-direction. Furthermore assume that it is absorbed if it ever reaches the line
$x = k_1$ or the line $y = k_2$. Then the analysis of the above two box problem
gives the mean number of displacements before it is absorbed. In the same
manner, such a random walk problem can be stated for $n$ dimensions. For $n$
equal to three, there will be three planes and the particle will be absorbed when
it reaches any one of the three.

Certain problems in public opinion polling may fit into this category of box
problems, particularly if the above problem is rephrased so that one requires
the mean number of trials to obtain at least $k_1$ balls in the first box and at least
$k_2$ balls in the second box, for the first time. For example, suppose that one
desires to sample from a population composed of two types of individuals,
A and B. Let the population proportions of A and B be known and be
denoted by $p_1$ and $p_2$. Then if one wishes to obtain at least $k_1$ individuals of type
A and at least $k_2$ individuals of type B, the average number of persons who must
be chosen in order to fulfill this condition is given by the analysis of the
corresponding box problem. This is rather artificial when there are only two
categories and $p_1 + p_2 = 1$. However, these restrictions will be removed in the
course of the paper, and the problem will be considered for any number of types
of individuals.

As a final example, consider one of the many bombing problems which arose
during the course of war research. Suppose that a factory which is to be
demolished has $n$ vital units, the destruction of any one of which will destroy the
usefulness of the factory. Let the probability be $p_1$ of hitting the first unit with
a single bomb, $p_2$ the probability of hitting the second with a single bomb, etc.,
and assume that $k_1$ bomb hits will finish off the first unit, $k_2$ the second, etc.
Then the mean number of bombs required will be given by the analysis for the
corresponding box problem.

Corresponding interpretations are possible for the other problems which are 
to be considered in this paper. Some of these will be indicated as the analysis 
proceeds and it is to be hoped that others will occur to the reader. 

As previously noted, this paper will be concerned with the distribution of balls 
necessary to terminate the throwing, assuming the p’s are known. Another 
possible interpretation is to assume the p’s unknown and to estimate them with 
the results of the ball throwing. Certain aspects of this problem for two boxes 
have been considered by J. B. S. Haldane [3] and Girshick, Mosteller and Savage [2].


3. Solution for the two box case. 


3.1. Distribution and moments of the number of trials necessary to obtain either
$k_1$ balls in the first box or $k_2$ balls in the second box. This problem may be stated
as follows: Suppose one is given two boxes with associated probabilities $p_1$ and
$p_2$, and associated integers $k_1$ and $k_2$. For the present it will be assumed that
$p_1 + p_2 = 1$, although this restriction will be removed later. Now let balls be
thrown one by one into these two boxes, the probability being $p_1$ that a particular
ball will fall in the first box and $p_2$ that it will fall in the second box. This
process is stopped on the first ball which leaves either $k_1$ balls in the first box or
$k_2$ balls in the second box. The number of balls, $x$, which is required to
accomplish this is a random variable and we desire the moments of $x$. The probability
that $k_1$ balls are obtained in the first box on the $x$th throw, $k_1 \le x \le k_1 + k_2 - 1$,
before $k_2$ balls are obtained in the second box, is immediately seen to be

$$\binom{x - 1}{k_1 - 1}\, p_1^{k_1}\, p_2^{x - k_1}. \tag{3.1}$$

Similar reasoning gives the probability that $k_2$ balls are obtained in the second
box for the first time on the $x$th throw, $k_2 \le x \le k_1 + k_2 - 1$, as

$$\binom{x - 1}{k_2 - 1}\, p_2^{k_2}\, p_1^{x - k_2}. \tag{3.2}$$

From (3.1) and (3.2), the $h$th moment of $x$, $E(x^h)$, is

$$E(x^h) = \sum_{x = k_1}^{k_1 + k_2 - 1} x^h \binom{x - 1}{k_1 - 1}\, p_1^{k_1}\, p_2^{x - k_1} + \sum_{x = k_2}^{k_1 + k_2 - 1} x^h \binom{x - 1}{k_2 - 1}\, p_2^{k_2}\, p_1^{x - k_2}. \tag{3.3}$$
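The stopping probabilities (3.1) and (3.2) and the moment sum (3.3) can be tabulated directly for small $k_1$, $k_2$. The following is a minimal sketch under the assumption $p_1 + p_2 = 1$; the function names `stop_pmf` and `moment` are illustrative and do not come from the paper. Exact rational arithmetic is used so that the probabilities can be checked to sum to one.

```python
from fractions import Fraction
from math import comb


def stop_pmf(k1, k2, p1):
    """Return {x: P(throwing stops on throw x)} for the two box problem.

    Throwing stops as soon as box 1 holds k1 balls or box 2 holds k2 balls.
    Assumes p1 + p2 = 1 and positive integers k1, k2.
    """
    p1 = Fraction(p1)
    p2 = 1 - p1
    pmf = {}
    for x in range(min(k1, k2), k1 + k2):
        prob = Fraction(0)
        if x >= k1:  # (3.1): the x-th ball is the k1-th ball in box 1
            prob += comb(x - 1, k1 - 1) * p1**k1 * p2 ** (x - k1)
        if x >= k2:  # (3.2): the x-th ball is the k2-th ball in box 2
            prob += comb(x - 1, k2 - 1) * p2**k2 * p1 ** (x - k2)
        pmf[x] = prob
    return pmf


def moment(pmf, h):
    """h-th ordinary moment E(x^h), i.e. the sum in (3.3)."""
    return sum(x**h * prob for x, prob in pmf.items())


if __name__ == "__main__":
    pmf = stop_pmf(3, 2, Fraction(1, 3))
    print(pmf, sum(pmf.values()), moment(pmf, 1))
```

Direct summation of this kind is practical only for small $k_1$ and $k_2$, which is the motivation for the factorial-moment route taken next in the text.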


However, it is inconvenient to consider (3.3) directly. A much simpler
procedure is to determine the increasing factorial moments of $x$ and then transform
these into the ordinary moments. Thus the $h$th increasing factorial moment of
$x$, $F_{h,1}[k_1(p_1), k_2(p_2)]$, is defined as $E[x(x + 1) \cdots (x + h - 1)]$. Then $F_{h,1}[\ ]$
is equal to

$$\sum_{x = k_1}^{k_1 + k_2 - 1} \frac{(x + h - 1)!}{(x - 1)!} \binom{x - 1}{k_1 - 1}\, p_1^{k_1}\, p_2^{x - k_1} + \sum_{x = k_2}^{k_1 + k_2 - 1} \frac{(x + h - 1)!}{(x - 1)!} \binom{x - 1}{k_2 - 1}\, p_2^{k_2}\, p_1^{x - k_2}. \tag{3.4}$$

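(3.4) can be transformed by means of the relationship

$$\sum_{j = 0}^{a} \binom{b + j}{j}\, p^j = (1 - p)^{-(b + 1)}\, I_{1 - p}(b + 1,\, a + 1), \tag{3.5}$$

where $I_x(r, s)$ is the Incomplete Beta-Function as tabulated by Karl Pearson
[6], and the result is obtained that

$$F_{h,1}[k_1(p_1), k_2(p_2)] = \frac{k_1(k_1 + 1) \cdots (k_1 + h - 1)}{p_1^h}\, I_{p_1}(k_1 + h, k_2) + \frac{k_2(k_2 + 1) \cdots (k_2 + h - 1)}{p_2^h}\, I_{p_2}(k_2 + h, k_1). \tag{3.6}$$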

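The ordinary $h$th moment of $x$ may be written in terms of $F_{1,1}[\ ]$, $F_{2,1}[\ ]$, $\ldots$,
$F_{h,1}[\ ]$:

$$E(x^h) = \sum_{j = 1}^{h} (-1)^{h - j}\, \frac{\Delta^j 0^h}{j!}\, F_{j,1}[\ ], \tag{3.7}$$

where $\Delta^j 0^h$ represents a difference of zero. Tabular values of these differences
are given by Fisher and Yates [1].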

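In particular, the mean and variance of $x$, which will receive the special
designations $E_1[k_1(p_1), k_2(p_2)]$ and $\sigma_1^2[k_1(p_1), k_2(p_2)]$ respectively, are

$$E_1[k_1(p_1), k_2(p_2)] = \frac{k_1}{p_1}\, I_{p_1}(k_1 + 1, k_2) + \frac{k_2}{p_2}\, I_{p_2}(k_2 + 1, k_1) \tag{3.8}$$

and

$$\sigma_1^2[k_1(p_1), k_2(p_2)] = \frac{k_1(k_1 + 1)}{p_1^2}\, I_{p_1}(k_1 + 2, k_2) + \frac{k_2(k_2 + 1)}{p_2^2}\, I_{p_2}(k_2 + 2, k_1) - E_1[k_1(p_1), k_2(p_2)] - \{E_1[k_1(p_1), k_2(p_2)]\}^2. \tag{3.9}$$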
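Formulas (3.8) and (3.9) are easy to evaluate when $k_1$ and $k_2$ are integers, because the incomplete beta function then reduces to a binomial tail. The sketch below assumes that standard identity, $I_x(a, b) = \sum_{j=a}^{a+b-1} \binom{a+b-1}{j} x^j (1-x)^{a+b-1-j}$, in place of Pearson's tables; the helper names are hypothetical and $p_1 + p_2 = 1$ is assumed.

```python
from fractions import Fraction
from math import comb


def inc_beta_int(x, a, b):
    """I_x(a, b) for positive integers a, b, via the binomial-tail identity."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def rising(k, h):
    """The product k (k + 1) ... (k + h - 1)."""
    out = 1
    for i in range(h):
        out *= k + i
    return out


def factorial_moment_1(h, k1, k2, p1):
    """h-th increasing factorial moment for the 'either box' problem."""
    p1 = Fraction(p1)
    p2 = 1 - p1
    return (rising(k1, h) / p1**h * inc_beta_int(p1, k1 + h, k2)
            + rising(k2, h) / p2**h * inc_beta_int(p2, k2 + h, k1))


def mean_var_1(k1, k2, p1):
    """Mean (3.8) and variance (3.9): Var = F_2 - E - E^2."""
    e1 = factorial_moment_1(1, k1, k2, p1)
    f2 = factorial_moment_1(2, k1, k2, p1)
    return e1, f2 - e1 - e1**2


if __name__ == "__main__":
    print(mean_var_1(5, 3, Fraction(2, 5)))
```

The variance line uses $E[x(x+1)] = E(x^2) + E(x)$, which is why both the mean and its square are subtracted from the second factorial moment.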


In the event the $p$'s are equal and sum to one, $E_1[k_1(p_1), k_2(p_2)]$ will be abbreviated
to $E_1[k_1, k_2]$, and finally, if both the $p$'s and $k$'s are equal, it will be written as
$E_1[k^2]$. In this two box situation, the only other possibility is $E_2[k_1(p_1), k_2(p_2)]$,
which will denote the expected number of balls required to obtain at least $k_1$
in the first box and at least $k_2$ in the second box, for the first time. This problem
will be considered in section 3.2.

In order to facilitate the computation of mean values, both for the two box
problem itself and for its application to problems involving a larger number of
boxes, (3.8) has been graphed for various values of $k_1$, $k_2$, $p_1$ and $p_2$. A
discussion of this procedure and the results obtained will be found in section 6.

There is one further result which will later prove useful. Consider the
situation when there is only one box with $p_1$ and $k_1$, $p_1 < 1$. This is the same as
having two boxes where the $k_2$ corresponding to the second box is infinite. In
other words, one can terminate the throwing of balls only because of what
happens to the first box, never because of anything that happens to the second box.
In this case one obtains

$$E_1[k_1(p_1), \infty(p_2)] = \sum_{x = k_1}^{\infty} x \binom{x - 1}{k_1 - 1}\, p_1^{k_1}\, p_2^{x - k_1} = \frac{k_1}{p_1}. \tag{3.10}$$

Similarly,

$$\sigma_1^2[k_1(p_1), \infty(p_2)] = \frac{k_1 p_2}{p_1^2}. \tag{3.11}$$


3.2. Distribution and moments of the number of throws necessary to obtain at
least $k_1$ balls in the first box and at least $k_2$ balls in the second box. This problem
may be stated as follows: Suppose there are two boxes with associated
probabilities $p_1$ and $p_2$, and associated integers $k_1$ and $k_2$. As in 3.1, $p_1 + p_2 = 1$.
Let balls be thrown into the boxes one by one, the probability being $p_1$ that a
particular ball will fall in the first box and $p_2$ that it will fall in the second box.
This process is stopped on the first ball which leaves at least $k_1$ in the first box
and exactly $k_2$ in the second or at least $k_2$ in the second and exactly $k_1$ in the first.
Again $x$ is the number of balls required to accomplish this. As explained in
3.1, the mean value in this case will be written as $E_2[k_1(p_1), k_2(p_2)]$. The analysis
follows through as in 3.1 and the mean number of trials is equal to

$$\sum_{x = k_1 + k_2}^{\infty} x \binom{x - 1}{k_1 - 1}\, p_1^{k_1}\, p_2^{x - k_1} + \sum_{x = k_1 + k_2}^{\infty} x \binom{x - 1}{k_2 - 1}\, p_2^{k_2}\, p_1^{x - k_2}. \tag{3.12}$$

Making use of (3.5), this can be written as

$$\frac{k_1}{p_1}\, [1 - I_{p_1}(k_1 + 1, k_2)] + \frac{k_2}{p_2}\, [1 - I_{p_2}(k_2 + 1, k_1)]. \tag{3.13}$$

Referring to (3.8) it is evident that

$$E_1[k_1(p_1), k_2(p_2)] + E_2[k_1(p_1), k_2(p_2)] = \frac{k_1}{p_1} + \frac{k_2}{p_2}. \tag{3.14}$$


The $h$th increasing factorial moment in this problem, denoted by $F_{h,2}[k_1(p_1),
k_2(p_2)]$, is

$$\frac{k_1(k_1 + 1) \cdots (k_1 + h - 1)}{p_1^h}\, [1 - I_{p_1}(k_1 + h, k_2)] + \frac{k_2(k_2 + 1) \cdots (k_2 + h - 1)}{p_2^h}\, [1 - I_{p_2}(k_2 + h, k_1)]. \tag{3.15}$$

Comparison of (3.15) with (3.6) gives the relationship

$$F_{h,1}[\ ] + F_{h,2}[\ ] = \frac{k_1(k_1 + 1) \cdots (k_1 + h - 1)}{p_1^h} + \frac{k_2(k_2 + 1) \cdots (k_2 + h - 1)}{p_2^h}. \tag{3.16}$$

The ordinary moments of $x$ can be computed from (3.15) by the use of (3.7).
That is, formula (3.7) holds in this case if $F_{h,1}[\ ]$ is replaced by $F_{h,2}[\ ]$.
It can be easily shown by the use of the recursion relationship for the
Incomplete Beta-Function,

$$I_x(p, q) = x\, I_x(p - 1, q) + (1 - x)\, I_x(p, q - 1),$$

that $F_{h,1}[\ ]$ and $F_{h,2}[\ ]$ satisfy the partial difference equation

$$F_{h,i}[k_1(p_1), k_2(p_2)] = h\, F_{h-1,i}[k_1(p_1), k_2(p_2)] + p_1 F_{h,i}[(k_1 - 1)(p_1), k_2(p_2)] + p_2 F_{h,i}[k_1(p_1), (k_2 - 1)(p_2)], \tag{3.17}$$

where $i = 1$ or 2. This equation can be used as an alternative way of obtaining
many results, examples of which are (3.10) and (3.11). Certain of these
applications have been discussed by McCarthy [5].
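For $h = 1$ the difference equation (3.17) is the usual first-step relation for the mean, and it gives a quick check of (3.14). The sketch below makes two assumptions that are not spelled out in the text: the zeroth factorial moment is taken as 1, and the boundary values are taken as 0 for the "either" problem once a box is filled, and $k_2/p_2$ or $k_1/p_1$ for the "both" problem once one box is filled (by (3.10)). Function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


def means_by_recursion(k1, k2, p1):
    """Return (E_1, E_2) for two boxes with p1 + p2 = 1, using (3.17) with h = 1."""
    p1 = Fraction(p1)
    p2 = 1 - p1

    @lru_cache(maxsize=None)
    def e1(a, b):
        # 'Either' problem: a, b are the balls still needed in boxes 1 and 2.
        if a == 0 or b == 0:
            return Fraction(0)
        return 1 + p1 * e1(a - 1, b) + p2 * e1(a, b - 1)

    @lru_cache(maxsize=None)
    def e2(a, b):
        # 'Both' problem: once one box is filled, only the other one matters.
        if a == 0:
            return b / p2
        if b == 0:
            return a / p1
        return 1 + p1 * e2(a - 1, b) + p2 * e2(a, b - 1)

    return e1(k1, k2), e2(k1, k2)


if __name__ == "__main__":
    e_either, e_both = means_by_recursion(3, 2, Fraction(1, 3))
    print(e_either, e_both, e_either + e_both)
```

The sum of the two returned means should equal $k_1/p_1 + k_2/p_2$ for any inputs, which is the content of (3.14).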

4. Solution for the n box case. 

4.1. Preliminary discussion. The problems of this section, although direct
generalizations of the two box cases, can perhaps be most easily stated and
illustrated as applied to the behavior of a random particle. Suppose that
we have a random particle which starts at the origin of $n$-dimensional rectangular
coordinates and moves in unit steps along the positive coordinate axes. At
any given point the probability will be taken as $p_i$ that it moves in the
$x_i$-direction. $\sum_{i=1}^{n} p_i$ is assumed to be one unless otherwise specified. Now consider the
$n$ hyperplanes, $x_i = k_i$, and assume that the particle will be absorbed if it passes
through a specified number, say $s$, of these hyperplanes. Notice that we are
interested only in the number of planes which it passes through, and not in the
particular ones. For each $s$, $(s = 1, 2, \ldots, n)$, the number of moves which the
particle makes before it is absorbed is a random variable, and in this section we
will be concerned with the distribution of this random variable. The
corresponding interpretation for boxes and balls is immediately obvious.

These problems are seen to be generalizations of the two box cases considered 
in section 3. Although it is always relatively easy to write down formal ex- 
pressions for the quantities to be considered, the step from two boxes to three 
or more boxes produces expressions which are extremely difficult, or even im- 
possible, to evaluate. In this section we shall develop approximate solutions 
which make use only of simple computations based on the solution for the two 
box case. 

As an introduction to the contents of this section, we shall discuss briefly a
box problem which is a special case of the general problem. Assume that there
are $n$ boxes with a probability of $1/n$ that any one ball will be thrown into a
particular one of the $n$ boxes. Then one can ask for the mean and variance of
the number of trials required to obtain $s$ occupied boxes (i.e. $k_1 = k_2 = \cdots = k_n = 1$).
Making use of (3.10) and (3.11), we obtain

$$\begin{aligned} E_1[1^n] &= 1 \\ E_2[1^n] &= 1 + \frac{n}{n - 1} \\ &\;\;\vdots \\ E_s[1^n] &= 1 + \frac{n}{n - 1} + \cdots + \frac{n}{n - s + 1} = \sum_{z = 0}^{s - 1} \frac{n}{n - z} \end{aligned} \tag{4.1}$$

and

$$\begin{aligned} \sigma_1^2[1^n] &= 0 \\ \sigma_2^2[1^n] &= 0 + \frac{n}{(n - 1)^2} \\ \sigma_3^2[1^n] &= 0 + \frac{n}{(n - 1)^2} + \frac{2n}{(n - 2)^2} \\ &\;\;\vdots \\ \sigma_s^2[1^n] &= 0 + \frac{n}{(n - 1)^2} + \frac{2n}{(n - 2)^2} + \cdots + \frac{(s - 1)n}{(n - s + 1)^2} = \sum_{z = 1}^{s - 1} \frac{zn}{(n - z)^2}. \end{aligned} \tag{4.2}$$

The solution for this problem for $s = n$ is given in Uspensky [9], but a
straightforward solution requires a great deal of formal manipulation. The step-by-step
procedure used here is somewhat indicative of the methods to be used in the
succeeding portions of this paper.
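The closed sums in (4.1) and (4.2) are simple enough to evaluate exactly. The sketch below is a direct transcription of those two sums, with a hypothetical function name; each new occupied box is treated as one waiting time of the kind covered by (3.10) and (3.11), with success probability $(n - z)/n$ when $z$ boxes are already occupied.

```python
from fractions import Fraction


def occupancy_mean_var(n, s):
    """Mean and variance of the throws needed to occupy s of n equiprobable boxes."""
    if not 1 <= s <= n:
        raise ValueError("need 1 <= s <= n")
    # z boxes already occupied: a new box is hit with probability (n - z) / n.
    mean = sum(Fraction(n, n - z) for z in range(s))
    var = sum(Fraction(z * n, (n - z) ** 2) for z in range(s))
    return mean, var


if __name__ == "__main__":
    for s in range(1, 7):
        print(s, occupancy_mean_var(6, s))
```

With $s = n$ this is the classical complete-occupancy case referred to in the text.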

4.2. Mean and variance of the number of trials required to obtain either $k_1$ balls
in the first box, or $k_2$ in the second, $\ldots$, or $k_{n-1}$ in the $(n - 1)$st, the probability
associated with the $n$th box being non-zero. The mean number of trials in this
particular problem is represented by $E_1[k_1(p_1), \ldots, k_{n-1}(p_{n-1}), \infty(p_n)]$. The
formal expression for this quantity is


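$$\sum_{i = 1}^{n - 1} \sum_{j = k_i}^{\infty} j\, \frac{(j - 1)!}{(k_i - 1)!\,(j - k_i)!}\, p_i^{k_i} \sum \frac{(j - k_i)!}{r_1! \cdots r_{i-1}!\, r_{i+1}! \cdots r_n!}\, p_1^{r_1} \cdots p_{i-1}^{r_{i-1}}\, p_{i+1}^{r_{i+1}} \cdots p_n^{r_n}, \tag{4.3}$$

where the third sum is taken over all values of the $r$'s such that

$$r_1 + \cdots + r_{i-1} + r_{i+1} + \cdots + r_n = j - k_i$$

and

$$r_1 \le k_1 - 1,\ \ldots,\ r_{i-1} \le k_{i-1} - 1,\ r_{i+1} \le k_{i+1} - 1,\ \ldots,\ r_{n-1} \le k_{n-1} - 1.$$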

This expression can be reduced by one dimension by the application of some of
the results for two boxes. Consider for the moment only those balls going into
the first $(n - 1)$ boxes. Then the number of balls (conditional) which is
necessary to obtain either $k_1$ in the first box, or $k_2$ in the second, $\ldots$, or $k_{n-1}$ in the
$(n - 1)$st box is a random variable $X$ which takes on values

$$k_1,\ k_1 + 1,\ \ldots,\ k_1 + k_2 + \cdots + k_{n-1} - (n - 2)$$

with corresponding probabilities $\pi_j$, where with no loss of generality it is
assumed that $k_1 \le k_2 \le \cdots \le k_{n-1}$. $\pi_j$ is given by a sum of $(n - 1)$ multinomial
expressions, the probability associated with the $i$th box now being
$p_i / (p_1 + p_2 + \cdots + p_{n-1})$, which will be designated by $p_i'$.

Under these circumstances it is apparent that

$$E_1[k_1(p_1), \ldots, k_{n-1}(p_{n-1}), \infty(p_n)] = \sum_j \pi_j\, E_1[x_j(p_1 + \cdots + p_{n-1}), \infty(p_n)]. \tag{4.4}$$

However, (3.10) can be applied to each term in (4.4), leading to

$$\frac{1}{(p_1 + p_2 + \cdots + p_{n-1})} \sum_j \pi_j x_j. \tag{4.5}$$

Now from the definition of $\pi_j$ and $x_j$ we have

$$E_1[k_1(p_1), \ldots, k_{n-1}(p_{n-1}), \infty(p_n)] = \frac{1}{(p_1 + p_2 + \cdots + p_{n-1})}\, E_1[k_1(p_1'), k_2(p_2'), \ldots, k_{n-1}(p_{n-1}')]. \tag{4.6}$$


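Similarly, the application of (3.11) gives the result that

$$\sigma_1^2[k_1(p_1), \ldots, k_{n-1}(p_{n-1}), \infty(p_n)] = \frac{1}{(p_1 + p_2 + \cdots + p_{n-1})^2}\, \sigma_1^2[k_1(p_1'), \ldots, k_{n-1}(p_{n-1}')] + \frac{p_n}{(p_1 + p_2 + \cdots + p_{n-1})^2}\, E_1[k_1(p_1'), \ldots, k_{n-1}(p_{n-1}')]. \tag{4.7}$$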
These results are of immediate importance for two reasons:

1. They indicate that by combining boxes and introducing a new random
variable, certain problems can be simplified. This statement will be expanded
and the principle applied repeatedly in the later portions of this paper.

2. With respect to the section on two boxes, they mean that the restriction
$p_1 + p_2 = 1$ is not necessary for the solution of the problems. One can always
assume that $p_3$ $(= 1 - p_1 - p_2)$ refers to a box which receives balls but which
otherwise has no effect on the outcome of an experiment. In this paper it has
been convenient to refer to such a box as having an infinite capacity.
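The reduction (4.6) can be verified exactly for two active boxes plus one box of infinite capacity. The sketch below is an illustration under stated assumptions: it computes the mean by first-step analysis (a ball falling in the idle box leaves the state unchanged), and the function name `e1_with_idle_box` is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


def e1_with_idle_box(k1, k2, p1, p2):
    """Mean throws until box 1 holds k1 or box 2 holds k2 balls.

    A third box of infinite capacity receives balls with probability
    1 - p1 - p2 (possibly zero) and never stops the throwing.
    """
    p1, p2 = Fraction(p1), Fraction(p2)

    @lru_cache(maxsize=None)
    def e(a, b):
        if a == 0 or b == 0:
            return Fraction(0)
        # E = 1 + p1 E(a-1, b) + p2 E(a, b-1) + p3 E(a, b), solved for E.
        return (1 + p1 * e(a - 1, b) + p2 * e(a, b - 1)) / (p1 + p2)

    return e(k1, k2)


if __name__ == "__main__":
    p1, p2 = Fraction(1, 5), Fraction(3, 10)
    total = p1 + p2
    lhs = e1_with_idle_box(3, 2, p1, p2)
    rhs = e1_with_idle_box(3, 2, p1 / total, p2 / total) / total
    print(lhs, rhs)
```

Both printed values should coincide: rescaling the probabilities to sum to one and dividing the resulting mean by $p_1 + p_2$ reproduces the mean with the idle box present.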

4.3. The mean value and variance of the number of trials required in a two box
problem when one or both of the constants $k_1$ and $k_2$ are replaced by random variables.
The discussion in 4.2 has indicated that the idea of associating a random variable
with a box instead of a single integer may sometimes lead to simplification.
Here this procedure will be treated in more detail. Consider $E_1[k_1(p_1), k_2(p_2)]$
and assume that $k_1$ is replaced by a random variable $X$ which can take on values
$x_1, x_2, \ldots, x_i, \ldots, x_t$ with corresponding probabilities $\pi_1, \ldots, \pi_i, \ldots, \pi_t$. Under
these circumstances $E_1[\ ]$ itself becomes the random variable $E_1[X(p_1), k_2(p_2)]$,
taking on values $E_1[x_i(p_1), k_2(p_2)]$, $(i = 1, 2, \ldots, t)$, with corresponding
probabilities $\pi_i$. The mean value of this new random variable can be formally
written down as

$$E(E_1[X(p_1), k_2(p_2)]) = \sum_{i = 1}^{t} \pi_i\, E_1[x_i(p_1), k_2(p_2)]. \tag{4.8}$$

This expression can always be calculated from the probabilities $\pi_i$ and (3.8)
or from the curves given in section 6. However, in the applications which will
arise later in this paper, this computation would be very time consuming.
Instead, an approximation to (4.8) will now be derived which will prove to yield
very good results, and which can be obtained by a simple reading on the above
mentioned curves.

If $X$ is regarded as a continuous variable, then $E_1[X(p_1), k_2(p_2)]$ is a
continuous function of $X$, and, in fact, can be represented by a single curve similar
to those appearing in section 6. Moreover, as is apparent from (3.8), repeated
differentiation of $E_1[X(p_1), k_2(p_2)]$ yields continuous derivatives. Consequently,
$E_1[X(p_1), k_2(p_2)]$ can be expanded in Taylor series about $a$, where
$a = \sum_{i=1}^{t} \pi_i x_i$. This procedure gives

$$E(E_1[X(p_1), k_2(p_2)]) = \sum_{i = 1}^{t} \pi_i \sum_{j = 0}^{\infty} \frac{(x_i - a)^j}{j!}\, E_1^{j}[a(p_1), k_2(p_2)], \tag{4.9}$$

where $E_1^{j}[a(p_1), k_2(p_2)]$ represents the $j$th derivative of $E_1[X(p_1), k_2(p_2)]$ with
respect to $X$ evaluated at $a$. Interchanging the order of summation one obtains

$$\sum_{j = 0}^{\infty} \frac{E_1^{j}[a(p_1), k_2(p_2)]}{j!} \sum_{i = 1}^{t} \pi_i (x_i - a)^j. \tag{4.10}$$
The final result then becomes

$$E(E_1[X(p_1), k_2(p_2)]) = \sum_{j = 0}^{\infty} \frac{\mu_j\, E_1^{j}[a(p_1), k_2(p_2)]}{j!}, \tag{4.11}$$

where $\mu_j$ is the $j$th moment of $X$ about its mean, $a$. Thus to a first
approximation

$$E(E_1[X(p_1), k_2(p_2)]) \cong E_1[a(p_1), k_2(p_2)]. \tag{4.12}$$

It is of interest to note that if $E_1[X(p_1), k_2(p_2)]$ is linear in $X$ then (4.12) is an
exact expression since all derivatives except the first are zero. Furthermore,
if $E_1[X(p_1), k_2(p_2)]$ is of the second degree in $X$, then only the second non-zero
term on the right hand side of (4.11) needs to be added to (4.12) in order to
make it exact. The former of these is the relation which gave an exact solution
in 4.2.
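A small numerical case shows what the first approximation (4.12) does. The sketch below compares the exact mixture (4.8) with $E_1[a(p_1), k_2(p_2)]$ for a two-point distribution of $X$ chosen so that the mean $a$ is an integer; that restriction is an assumption of the sketch (it avoids evaluating (3.8) at non-integer arguments), and the function names are hypothetical. The two box mean is computed by first-step recursion with $p_1 + p_2 = 1$.

```python
from fractions import Fraction
from functools import lru_cache


def e1(k1, k2, p1):
    """Exact E_1[k1(p1), k2(p2)] for integer k1, k2 and p1 + p2 = 1."""
    p1 = Fraction(p1)
    p2 = 1 - p1

    @lru_cache(maxsize=None)
    def e(a, b):
        if a == 0 or b == 0:
            return Fraction(0)
        return 1 + p1 * e(a - 1, b) + p2 * e(a, b - 1)

    return e(k1, k2)


def exact_and_first_approx(values, probs, k2, p1):
    """Return (exact mixture (4.8), first approximation (4.12))."""
    probs = [Fraction(p) for p in probs]
    exact = sum(p * e1(x, k2, p1) for x, p in zip(values, probs))
    a = sum(p * x for x, p in zip(values, probs))
    if a.denominator != 1:
        raise ValueError("this sketch needs an integer mean a")
    return exact, e1(int(a), k2, p1)


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(exact_and_first_approx([1, 3], [half, half], 2, half))
```

In this example the approximation exceeds the exact mixture, as one would expect when $E_1$ flattens out for large $X$; the neglected terms of (4.11) account for the difference.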

It is important to realize that this analysis for $E(E_1[X(p_1), k_2(p_2)])$ can be
immediately applied to $E(E_2[X(p_1), k_2(p_2)])$. For, by the use of (3.14) and
(4.8), one obtains

$$E(E_2[X(p_1), k_2(p_2)]) = \frac{a}{p_1} + \frac{k_2}{p_2} - E(E_1[X(p_1), k_2(p_2)]). \tag{4.13}$$

The same analysis can be applied to $F_{h,i}[\ ]$ and the general result obtained
that

$$E(F_{h,i}[X(p_1), k_2(p_2)]) \cong F_{h,i}[a(p_1), k_2(p_2)]. \tag{4.14}$$

This immediately allows one to approximate the variance in the obvious manner.

It is of interest to consider briefly the situation when both $k_1$ and $k_2$ are
replaced by random variables. Let $k_1$ be replaced by $X_1$, taking on values $x_{11}$,
$x_{12}, \ldots, x_{1t}$ with probabilities $\pi_{11}, \pi_{12}, \ldots, \pi_{1t}$ and $k_2$ be replaced by $X_2$,
taking on values $x_{21}, x_{22}, \ldots, x_{2s}$ with probabilities $\pi_{21}, \pi_{22}, \ldots, \pi_{2s}$. Then

$$E(E_1[X_1(p_1), X_2(p_2)]) = \sum_{i,j} \pi_{1i}\, \pi_{2j}\, E_1[x_{1i}(p_1), x_{2j}(p_2)], \tag{4.15}$$

where $i = 1, 2, \ldots, t$ and $j = 1, 2, \ldots, s$. Again applying Taylor series
and expanding about $a = \sum_i \pi_{1i} x_{1i}$ and $b = \sum_j \pi_{2j} x_{2j}$, the result is obtained
that

$$E(E_1[X_1(p_1), X_2(p_2)]) = \sum_{u,v} \frac{E_1^{uv}[a(p_1), b(p_2)]}{u!\, v!} \sum_i \pi_{1i} (x_{1i} - a)^u \sum_j \pi_{2j} (x_{2j} - b)^v, \tag{4.16}$$

where $E_1^{uv}[a(p_1), b(p_2)]$ is the $u$th partial derivative with respect to $X_1$ and
the $v$th partial derivative with respect to $X_2$ of $E_1[X_1(p_1), X_2(p_2)]$ evaluated
at $X_1 = a$, $X_2 = b$. This gives the approximate formula

$$E(E_1[X_1(p_1), X_2(p_2)]) \cong E_1[a(p_1), b(p_2)]. \tag{4.17}$$
4.4. Mean and variance of the number of trials required to obtain either (at
least) $k_1$ balls in the first box, or (at least) $k_2$ balls in the second box, $\ldots$, or (and
at least) $k_n$ balls in the $n$th box. In accordance with previous notation the mean
number of trials required is given by $E_1[k_1(p_1), k_2(p_2), \ldots, k_n(p_n)]$. The exact
value of this quantity can be written down and it would be a complicated
multinomial expression. The evaluation of such an expression would be extremely
difficult, if not impossible, especially for large values of $k_1, k_2, \ldots, k_n$. In
order to obtain an approximation to $E_1[\ ]$, repeated applications of (4.12) can
be made and the resulting expression can be evaluated by means of the curves
in section 6.

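For convenience, consider $E_1[k_1(p_1), k_2(p_2), k_3(p_3), k_4(p_4)]$. The general
result will then be apparent. Assume that the first three boxes form a single
unit with probability $(p_1 + p_2 + p_3)$. Then the number of balls required to
obtain either $k_1$ in the first, $k_2$ in the second or $k_3$ in the third, if all balls are going
in these three boxes, is a random variable $X$. Consequently,

$$E_1[k_1(p_1), \ldots, k_4(p_4)] = E(E_1[X(p_1 + p_2 + p_3), k_4(p_4)]). \tag{4.18}$$

Applying (4.12),

$$E_1[k_1(p_1), \ldots, k_4(p_4)] \cong E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1 + p_2 + p_3}\right), k_2\!\left(\tfrac{p_2}{p_1 + p_2 + p_3}\right), k_3\!\left(\tfrac{p_3}{p_1 + p_2 + p_3}\right) \right]\!(p_1 + p_2 + p_3),\ k_4(p_4) \right]. \tag{4.19}$$

Applying (4.12) once again the final approximation is

$$E_1[k_1(p_1), \ldots, k_4(p_4)] \cong E_1\!\left[ E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1 + p_2}\right), k_2\!\left(\tfrac{p_2}{p_1 + p_2}\right) \right]\!\left(\tfrac{p_1 + p_2}{p_1 + p_2 + p_3}\right),\ k_3\!\left(\tfrac{p_3}{p_1 + p_2 + p_3}\right) \right]\!(p_1 + p_2 + p_3),\ k_4(p_4) \right]. \tag{4.20}$$

Expression (4.20) can be translated into a course of procedure. One considers
the first two boxes and computes

$$a_1 = E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1 + p_2}\right), k_2\!\left(\tfrac{p_2}{p_1 + p_2}\right) \right].$$

It is then assumed that $a_1$ is a new number associated with a box with probability
$(p_1 + p_2)$ and

$$a_2 = E_1\!\left[ a_1\!\left(\tfrac{p_1 + p_2}{p_1 + p_2 + p_3}\right), k_3\!\left(\tfrac{p_3}{p_1 + p_2 + p_3}\right) \right].$$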
Repeating this procedure again, one computes $a_3 = E_1[a_2(p_1 + p_2 + p_3), k_4(p_4)]$,
and by (4.20) this is approximately equal to $E_1[k_1(p_1), \ldots, k_4(p_4)]$. This method
of computation is seen to be completely general and one can apply it to any
number of boxes. Each step consists of computing $E_1[\ ]$ for two boxes and
consequently can be carried out with the curves of section 6. It is evident that
the order in which the boxes are taken may have an important effect on the size
of the error involved in using this step-by-step procedure. This problem will
be considered in section 5.
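The step-by-step procedure can be sketched in a few lines. This is an illustration under assumptions, not the author's computing method: the curves of section 6 are replaced by a numerical evaluation of (3.8), with the incomplete beta function obtained by Simpson's rule (adequate here because both of its parameters are at least 1), and the probabilities are assumed to sum to one. All function names are hypothetical.

```python
import math


def inc_beta(x, a, b, steps=2000):
    """Regularized incomplete beta I_x(a, b) by Simpson's rule; assumes a, b >= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    width = x / steps  # steps must be even for Simpson's rule

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * width)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return total * width / 3 * math.exp(log_norm)


def e1_two_box(k1, p1, k2, p2):
    """Formula (3.8) with p1 + p2 = 1; k1 and k2 may be non-integer."""
    return (k1 / p1 * inc_beta(p1, k1 + 1, k2)
            + k2 / p2 * inc_beta(p2, k2 + 1, k1))


def stepwise_e1(ks, ps):
    """Approximate E_1[k_1(p_1), ..., k_n(p_n)] by repeated use of (4.12)."""
    a, mass = float(ks[0]), ps[0]
    for k, p in zip(ks[1:], ps[1:]):
        total = mass + p
        # Merge the boxes handled so far (pseudo-count a) with the next box.
        a = e1_two_box(a, mass / total, k, p / total)
        mass = total
    return a


if __name__ == "__main__":
    print(stepwise_e1([2, 2, 2], [1 / 3, 1 / 3, 1 / 3]))
```

For three equiprobable boxes with $k = 2$ the exact mean, found by direct enumeration of the stopping throw, is $26/9 \approx 2.889$; the stepwise value printed above should lie within a few percent of it. Changing the order of `ks` and `ps` is a simple way to explore the order effect mentioned in the text.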

It is of interest to note that one can also obtain another approximation for
$E_1[k_1(p_1), k_2(p_2), k_3(p_3), k_4(p_4)]$. Suppose that the first two boxes are
considered as one unit and the second two boxes as another unit. Then the
number of balls which must fall in the first two boxes in order to obtain either $k_1$ in
the first box or $k_2$ in the second is a random variable $X_1$. Similarly a random
variable $X_2$ can be associated with the last two boxes. Accordingly

$$E_1[k_1(p_1), \ldots, k_4(p_4)] = E(E_1[X_1(p_1 + p_2), X_2(p_3 + p_4)]). \tag{4.21}$$

By use of (4.17), (4.21) can be written as

$$E_1[k_1(p_1), \ldots, k_4(p_4)] \cong E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1 + p_2}\right), k_2\!\left(\tfrac{p_2}{p_1 + p_2}\right) \right]\!(p_1 + p_2),\ E_1\!\left[ k_3\!\left(\tfrac{p_3}{p_3 + p_4}\right), k_4\!\left(\tfrac{p_4}{p_3 + p_4}\right) \right]\!(p_3 + p_4) \right]. \tag{4.22}$$

This same analysis applies directly to the factorial moments. In particular

$$F_{2,1}[k_1(p_1), \ldots, k_4(p_4)] \cong F_{2,1}\!\left[ E_1\!\left[ E_1\!\left[ k_1\!\left(\tfrac{p_1}{p_1 + p_2}\right), k_2\!\left(\tfrac{p_2}{p_1 + p_2}\right) \right]\!\left(\tfrac{p_1 + p_2}{p_1 + p_2 + p_3}\right),\ k_3\!\left(\tfrac{p_3}{p_1 + p_2 + p_3}\right) \right]\!(p_1 + p_2 + p_3),\ k_4(p_4) \right]. \tag{4.23}$$

From (4.20) and (4.23) an approximate value for $\sigma_1^2[k_1(p_1), k_2(p_2), k_3(p_3), k_4(p_4)]$
can be obtained. This procedure is also perfectly general and so an estimate
of $\sigma_1^2[\ ]$ can be obtained for any number of boxes.

This same method can be immediately applied to the approximation of
$E_n[k_1(p_1), \ldots, k_n(p_n)]$. One simply considers the boxes two at a time,
computing $E_2[\ ]$ at each stage instead of $E_1[\ ]$.

4.5. Solution for $E_s[k^n]$ and $E_s[k_1^{n-1}, k_2]$. When $s$ is different
from 1 or $n$, the complexities of the problem force one into the consideration
of only the quantities given in the title of this subsection. The corresponding
problem for three boxes, namely $E_2[k_1(p_1), k_2(p_2), k_3(p_3)]$, has been
treated for general $k_i$ and $p_i$ by McCarthy [5]. However, the resulting
expression is so complicated that it will not be given here.

The process to be used consists of reducing the subscript $s$ by a series of
steps until the subscript 2 is reached. This expression can then be evaluated
by the use of the curves or by simple computation. For the sake of convenience,
the case $E_3[k^4]$ will be considered in detail. It will then be possible to
write down the expression for general $s$ and $n$.

As a starting point, look upon the first three boxes as a single unit. Then
there is a definite probability that one of these boxes will have $k$ balls in
it for the first time on the $x_i$th throw into these three boxes and that the
other two boxes of the unit will each have less than $k$ balls. Then if one of
the other two boxes of the unit has $u$ balls ($u < k$), the third box will
have $(x_i - k - u)$ balls, $(x_i - k - u < k)$. Meanwhile the fourth box will
also have been receiving balls, and the number in it at this time will be
denoted by $j$ ($j = 0, 1, 2, \cdots, \infty$). For each $x_i$ there is a
probability associated with $u$, namely $P(u \mid x_i)$, and another
probability associated with $j$, $P(j \mid x_i)$. For the moment, consider that
box 1 has received $k$ balls, box 2 the $(x_i - k - u)$ balls, box 3 the $u$
balls and box 4 the $j$ balls. This numbering is of course immaterial since the
situation is symmetric with respect to the first three boxes.

Now if $j \ge k$, either $(2k + u - x_i)$ balls will be required in the second
box or $(k - u)$ balls in the third box in order to obtain three properly
occupied boxes. On the other hand, if $j < k$, the specified number will be
required in any two of boxes two, three and four. Consequently, with this
conditional description of the situation, the required number of balls
necessary to obtain three out of the four boxes occupied in the proper manner
is

(4.24)  $x_i + j + E_2[(2k + u - x_i), (k - u), (k - j)]$,

where $(k - j)$ will be taken as zero if $j$ is greater than or equal to $k$.
From this description, it is evident that the desired mean value may be
obtained by summing (4.24) over all possible values of $x_i$, $j$ and $u$.
Therefore


(4.25)  $E_3[k^4] = \sum_{x_i} \pi_{x_i} \left\{ x_i + \sum_{j=0}^{\infty} P(j \mid x_i) \left( j + \sum_u P(u \mid x_i)\, E_2[(2k + u - x_i), (k - u), (k - j)] \right) \right\}$.

It is to be noticed that the probabilities inside the $E_2[\ ]$ in (4.24) and
(4.25) do not add to one but only to 3/4. This can be easily remedied by the
application of a formula similar to (4.6) and the result is obtained that

(4.26)  $E_3[k^4] = \sum_{x_i} \pi_{x_i} \left\{ x_i + \sum_{j=0}^{\infty} P(j \mid x_i) \left( j + \frac{4}{3} \sum_u P(u \mid x_i)\, E_2[(2k + u - x_i), (k - u), (k - j)] \right) \right\}$,

where each probability inside $E_2[\ ]$ is now 1/3.
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By simple considerations 
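(4.27)  $P(u \mid x_i) = \dfrac{(x_i - k)! \,/\, [u!\,(x_i - k - u)!]}{\sum_u (x_i - k)! \,/\, [u!\,(x_i - k - u)!]}$,

where $u$ and $(x_i - k - u)$ are both less than $k$, and

(4.28)  $P(j \mid x_i) = \dfrac{(x_i + j - 1)!}{(x_i - 1)!\, j!} \left(\tfrac{3}{4}\right)^{x_i} \left(\tfrac{1}{4}\right)^{j}$.

From (4.27) and (4.28)

(4.29)  $\sum_j j\, P(j \mid x_i) = x_i/3$

and

(4.30)  $\sum_u u\, P(u \mid x_i) = (x_i - k)/2$.

(4.25) can be written as

(4.31)  $E_3[k^4] = \frac{4}{3} \sum_{x_i} \pi_{x_i} x_i + \frac{4}{3} \sum_{x_i} \pi_{x_i} \sum_j P(j \mid x_i) \sum_u P(u \mid x_i)\, E_2[(2k + u - x_i), (k - u), (k - j)]$.

Finally, making use of (4.29), (4.30), the definition of $\pi_{x_i}$ and $x_i$
and the procedure of replacing random variables inside an $E_2[\ ]$ by their
mean values,

(4.32)  $E_3[k^4] \cong \frac{4}{3} \left\{ E_1[k^3] + E_2\left[ \left( \frac{3k}{2} - \frac{E_1[k^3]}{2} \right)^2, \left( k - \frac{E_1[k^3]}{3} \right) \right] \right\}$

and this in turn can be written as

(4.33)  $E_3[k^4] \cong \frac{4}{3} \left\{ E_1[k^3] + E_2\left[ \left( k - \frac{E_1[k^3] - k}{2} \right)^2, \left( k - \frac{E_1[k^3]}{3} \right) \right] \right\}$.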
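The mean-value substitution behind the passage from (4.31) to (4.33) can be spelled out. The short worked version below is a sketch: it assumes only the conditional means (4.29) and (4.30) and the fact that the mean of $x_i$ is $E_1[k^3]$, and writes $\bar{u}$ and $\bar{j}$ (symbols introduced here for illustration) for the means of $u$ and $j$ evaluated at $x_i = E_1[k^3]$.

```latex
\begin{align*}
\sum_{x_i}\pi_{x_i}\,x_i &= E_1[k^3], \qquad
\bar{u} = \frac{E_1[k^3]-k}{2}, \qquad
\bar{j} = \frac{E_1[k^3]}{3},\\
2k + \bar{u} - E_1[k^3] &= k - \frac{E_1[k^3]-k}{2}
  = \frac{3k}{2} - \frac{E_1[k^3]}{2},\\
k - \bar{u} &= k - \frac{E_1[k^3]-k}{2}
  = \frac{3k}{2} - \frac{E_1[k^3]}{2},\\
k - \bar{j} &= k - \frac{E_1[k^3]}{3}.
\end{align*}
```

The first two arguments of $E_2[\ ]$ therefore coincide, which is why they can be written as a single quantity raised to the power 2 in the box notation.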

This method of analysis which has just been applied to $E_3[k^4]$ can be used
equally well for $E_s[k^n]$. Here one simply considers the first $(n - 1)$
boxes and proceeds as above. The final result is immediately apparent, namely
that

(4.34)  $E_s[k^n] \cong \frac{n}{n-1} \left\{ E_1[k^{n-1}] + E_{s-1}\left[ \left( k - \frac{E_1[k^{n-1}] - k}{n-2} \right)^{n-2}, \left( k - \frac{E_1[k^{n-1}]}{n-1} \right) \right] \right\}$.

It will be noticed that in reducing (4.34) further it will be necessary to
consider expressions of the form $E_s[k_1^{n-1}, k_2]$. However, it will be
seen from the foregoing analysis that no use was made of the fact that the
integers attached to the first $(n - 1)$ boxes were the same. Accordingly,

(4.35)  $E_s[k_1^{n-1}, k_2] \cong \frac{n}{n-1} \left\{ E_1[k_1^{n-1}] + E_{s-1}\left[ \left( k_1 - \frac{E_1[k_1^{n-1}] - k_1}{n-2} \right)^{n-2}, \left( k_2 - \frac{E_1[k_1^{n-1}]}{n-1} \right) \right] \right\}$.

Now, by the use of (4.34) and (4.35), it is possible to reduce $s$ as much as
may be desired.


5. Some considerations concerning the error of the approximations.

5.1. Preliminary remarks. This discussion of the errors of the approximations
given in the preceding sections has been left until now so that a broad
perspective might be gained, and the errors seen in relationship to one
another. Such an arrangement is advantageous in this instance since both the
analytical and computational results bearing on the subject are scanty, and
consequently, any intelligent leads which their inter-relationships can give
are most helpful.

The difficulty involved in obtaining exact values for the various quantities
considered in this paper has been pointed out quite frequently, and the
approximations have been devised to overcome this very difficulty. The same
complexity which prevents the computation of many exact values also prevents
any effective analytic approach to the problem of evaluating the errors. For
these reasons the author has been unable to carry through any general analytic
treatment of the errors of the approximations. However, because the intelligent
use of approximations requires some knowledge of their accuracy, certain
isolated cases have been investigated by a combination of computational,
graphical and analytic methods. These investigations are detailed in the
remainder of this section, and conjectures concerning the general behavior of
the errors are made whenever possible. As has been stated earlier, no
consideration will be given to the approximation formulae for the variance.
5.2. Errors of the approximations for $E_1[k_1(p_1), \cdots, k_n(p_n)]$ and
$E_n[k_1(p_1), \cdots, k_n(p_n)]$. Taking $n$ equal to 3, we have from (4.11)
that

(5.1)  $\left| E_1[k_1(p_1), k_2(p_2), k_3(p_3)] - E_1[a(p_1 + p_2), k_3(p_3)] \right| \le \frac{1}{2}\, \sigma_1^2\left[ k_1\left(\frac{p_1}{p_1+p_2}\right), k_2\left(\frac{p_2}{p_1+p_2}\right) \right] \cdot \mathrm{Max}\left| E_1''[X(p_1 + p_2), k_3(p_3)] \right|$,

where $\mathrm{Max}\,|E_1''[X(p_1 + p_2), k_3(p_3)]|$ is the maximum absolute
value of the second derivative of $E_1[X(p_1 + p_2), k_3(p_3)]$ with respect to
$X$, and $a$ is equal to $E_1[k_1(p_1/(p_1 + p_2)), k_2(p_2/(p_1 + p_2))]$. Now
an examination of the curves given in section 6 indicates that, for fixed
$p_3$, the maximum curvature of $E_1[X(p_1 + p_2), k_3(p_3)]$, considered as a
function of $X$, is a monotone decreasing function of $k_3$. Since this
curvature is negative, this geometric observation is equivalent to

(5.2)  $\mathrm{Max}\left| E_1''[X(p_1 + p_2), (k_3 + 1)(p_3)] \right| < \mathrm{Max}\left| E_1''[X(p_1 + p_2), k_3(p_3)] \right|$,

although it is not necessarily true that

$\left| E_1''[X(p_1 + p_2), (k_3 + 1)(p_3)] \right| < \left| E_1''[X(p_1 + p_2), k_3(p_3)] \right|$.

Moreover,

(5.3)  $E_1[k_1(p_1), k_2(p_2), k_3(p_3)] < E_1[k_1(p_1), k_2(p_2), (k_3 + 1)(p_3)]$.

From (5.1), (5.2) and (5.3) one readily obtains that the absolute value of the
percentage error of the approximation to $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is
bounded by a function, say $U_1[k_1(p_1), k_2(p_2), k_3(p_3)]$, which is a
monotone decreasing function of $k_3$ as $k_3$ increases. It should be noticed
that the results of 4.2 have already shown not only that this upper bound for
the percentage error approaches zero as $k_3$ becomes infinite, but also that
the absolute difference between the true and approximate values approaches zero
as $k_3$ becomes infinite.

Computation of $U_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is very time consuming
because of the difficulty in obtaining
$\mathrm{Max}\,|E_1''[X(p_1 + p_2), k_3(p_3)]|$, and because the direct
computation of $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is laborious when any of
$k_1$, $k_2$ and $k_3$ are much larger than 2 or 3. In order to surmount these
difficulties and still give some indication of the behavior of
$U_1[k_1(p_1), k_2(p_2), k_3(p_3)]$, the following expedients were adopted:

1. The values of $k_1$, $k_2$ and $k_3$ were each fixed at 5.

2. $\mathrm{Max}\,|E_1''[X(p_1 + p_2), k_3(p_3)]|$ was obtained by graphical
means, namely drawing the slopes of the appropriate curve in section 6,
graphing these slopes and then taking off the maximum slopes of these curves.

3. $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ was replaced by its approximation,
$E_1[a(p_1 + p_2), k_3(p_3)]$, in the computation of the percentage error. This
new bound will be denoted by $U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]$.

4. Carefully chosen values of $U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]$ were
plotted on triangular coordinates, and contour lines interpolated and
extrapolated to cover in large part the range of $p_1$, $p_2$ and $p_3$.

The use of the third of the above listed assumptions is no detriment to the
usefulness of the results since

$\dfrac{E_{1a}[\ ] - E_1[\ ]}{E_1[\ ]} = \dfrac{\dfrac{E_{1a}[\ ] - E_1[\ ]}{E_{1a}[\ ]}}{1 - \dfrac{E_{1a}[\ ] - E_1[\ ]}{E_{1a}[\ ]}} \le \dfrac{U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]}{100 - U_1^*[k_1(p_1), k_2(p_2), k_3(p_3)]}$,

where $E_{1a}[\ ] = E_1[a(p_1 + p_2), k_3(p_3)]$ and
$E_1[\ ] = E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$. Since $U_1^*[\ ]$ is a monotone
decreasing function of $k_3$, this new bound on the percentage error is also
monotone decreasing for increasing $k_3$. Absolute values were not required in
this derivation since $E_{1a}[\ ]$ is always greater than or equal to
$E_1[\ ]$, as is apparent from (5.1) and an examination of the curves of
section 6. The contours of $U_1^*[5(p_1), 5(p_2), 5(p_3)]$ are shown in Fig. 1.
The interpretation of this figure is very straightforward. For example, for
$p_3 \le .5$, the value of $U_1^*[5(p_1), 5(p_2), 5(p_3)]$ is less than 5.0%.
Making use of the definition of $U_1^*[\ ]$, and especially its monotone
characteristic, one can then say: the approximation for
$E_1[5(p_1), 5(p_2), k_3(p_3)]$, where $k_3 \ge 5$, $p_3 \le .50$, is in error
by not more than 5.3%. Moreover, as has been already observed,
$E_1[a(p_1 + p_2), k_3(p_3)]$ is always greater than or equal to
$E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$.

Fig. 1. Contours of $U_1^*[5(p_1), 5(p_2), 5(p_3)]$ Considered as a Function
of $p_1$, $p_2$ and $p_3$
It will be noticed from Fig. 1 that $U_1^*[\ ]$ is increasing steadily as $p_3$
approaches 1. It has been demonstrated by McCarthy [5] that this behavior of
the upper bound does not mean that the percentage error itself becomes larger
as $p_3$ approaches 1. As a matter of fact, for fixed $k_1$, $k_2$ and $k_3$,
the percentage error approaches zero as $p_3$ approaches 1. However, this
demonstration does not furnish any reasonable bounds with which to fill in the
lower left hand corner of Fig. 1. This fact is not as serious as it may at
first seem because there is nothing to prevent one from reordering the boxes.
For example, consider $E_1[5(.2), 5(.2), 5(.6)]$. From Fig. 1, the error of the
approximation for this quantity, namely $E_1[E_1[5(.5), 5(.5)](.4), 5(.6)]$, is
not more than approximately

7.5/(100 - 7.5) = 8.1%.

On the other hand this same figure shows that
$E_1[E_1[5(.25), 5(.75)](.80), 5(.20)]$, which is also an approximation to
$E_1[5(.2), 5(.2), 5(.6)]$, is in error by not more than approximately 8%.
Consequently one would choose the second ordering.
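The conversion used in these examples, from a bound $U_1^*$ expressed relative to the approximation to a bound relative to the true value, is simple enough to check numerically. The following is a minimal sketch; the function name is hypothetical and the inputs are the contour readings quoted in the text.

```python
def bound_relative_to_true(u_star_percent):
    """Turn a percentage bound relative to the approximation into one relative to the true value."""
    return 100.0 * u_star_percent / (100.0 - u_star_percent)

print(round(bound_relative_to_true(7.5), 1))  # 8.1
print(round(bound_relative_to_true(5.0), 1))  # 5.3
```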

The procedure which has been used to obtain an upper bound on the percentage
error of the approximation to $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$, with $k_1$
and $k_2$ fixed and $k_3$ greater than or equal to that integer at which the
bound is evaluated, can also be applied to $E_3[k_1(p_1), k_2(p_2), k_3(p_3)]$.
All the assumptions remain the same and in this case the bounds corresponding
to $U_1[\ ]$ and $U_1^*[\ ]$ are denoted by $U_3[\ ]$ and $U_3^*[\ ]$. As in
the case of $U_1[\ ]$, we have

$\dfrac{E_3[\ ] - E_{3b}[\ ]}{E_3[\ ]} = \dfrac{\dfrac{E_3[\ ] - E_{3b}[\ ]}{E_{3b}[\ ]}}{1 + \dfrac{E_3[\ ] - E_{3b}[\ ]}{E_{3b}[\ ]}} \le \dfrac{U_3^*[k_1(p_1), k_2(p_2), k_3(p_3)]}{100}$.

Here the approximation, $E_{3b}[\ ] = E_2[b(p_1 + p_2), k_3(p_3)]$, is always
less than or equal to the exact value, $E_3[k_1(p_1), k_2(p_2), k_3(p_3)]$. The
contours of $U_3^*[5(p_1), 5(p_2), 5(p_3)]$ are shown in Fig. 2. In using
$U_3^*[5(p_1), 5(p_2), 5(p_3)]$ it is sometimes advantageous to reorder the
boxes. For example, consider $E_3[5(.2), 5(.2), 5(.6)]$. Fig. 2 shows that, as
an approximation, $E_2[E_2[5(.5), 5(.5)](.4), 5(.6)]$ is in error by not more
than approximately 9%. However, $E_2[E_2[5(.25), 5(.75)](.80), 5(.20)]$, which
is also an approximation for $E_3[5(.2), 5(.2), 5(.6)]$, is in error by not
more than about 7%. There is a gain here, but it is not as great as the
corresponding situation for $E_1[5(.2), 5(.2), 5(.6)]$.

As has already been stated, one may minimize the error by correctly choosing
the two boxes which are to be combined first. Some discussion will be given
here of a procedure for choosing these two boxes. Of course an experimental
scheme may be used which makes use of the fact that the approximation to
$E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$ is always an overestimate. In other words,
that grouping is used which gives rise to the smallest value of the
approximation. However, this can be replaced by a few preliminary computations.

As can be seen from (5.1), the error of the approximation depends upon two
quantities, namely the variance of the two box situation obtained by combining
two of the boxes, and the maximum value of the second derivative of the curve
representing the function $E_1[X(p_1 + p_2), k_3(p_3)]$ over the proper range
of $X$ values. The error will be zero if $E_1[X(p_1 + p_2), k_3(p_3)]$ is
either a constant or linear in $X$ over the range of $X$ values in which one is
interested, that is $k_1 \le X \le k_1 + k_2 - 1$, $k_1 \le k_2$. If this is
not possible, then one wishes to make it as near so as possible, subject to the
restriction that

$\sigma_1^2[k_1(p_1/(p_1 + p_2)), k_2(p_2/(p_1 + p_2))]$

is not unnecessarily large.

Fig. 2. Contours of $U_3^*[5(p_1), 5(p_2), 5(p_3)]$ Considered as a Function
of $p_1$, $p_2$ and $p_3$

An indication of the relationship between the boxes for both linearity and
contribution to variance can be obtained from expressions (3.10) and (3.11).
Thus for each box one computes $k_i/p_i$ and $k_i(1 - p_i)/p_i^2$. Then in
order to most nearly achieve linearity one orders the boxes in accordance with
the increasing order of $k_i/p_i$ and combines them in that order. If there is
a tie between two or more boxes with respect to the $k_i/p_i$ ordering, then
one orders these "tied" boxes in accordance with increasing
$k_i(1 - p_i)/p_i^2$.

Some computations have been carried out to illustrate these points and they
are given in Table 1. The notation ((2, 4), 6) means that one first combines
the boxes with integers 2 and 4, and then combines this result with the box
with associated integer 6. All values in this table were obtained by direct
computation. No use of the curves was made.

TABLE 1
Effect of Order of Combination on Error of Approximation

  p_1 = 1/6   p_2 = 1/3   p_3 = 1/2               % Error of Approximation
                                                     Order of Combination
     k_1         k_2         k_3      E_1[ ]   ((2, 4), 6)  ((2, 6), 4)  ((4, 6), 2)
      2           4           6        6.96       +1.7         +2.4         +1.0
      4           6           2        3.92       +0.3         +0.5         +0.5
      6           4           2        3.77       +0.0         +1.3         +1.6

In these three situations, one obtains the values given in Table 2.

TABLE 2

                          First case         Second case        Third case
  p_i                   1/6   1/3   1/2     1/6   1/3   1/2     1/6   1/3   1/2
  k_i                    2     4     6       4     6     2       6     4     2
  k_i/p_i               12    12    12      24    18     4      36    12     4
  k_i(1 - p_i)/p_i^2    60    24    12     120    36     4     180    24     4

Thus in the first case there is nothing to choose with respect to $k_i/p_i$,
but $k_i(1 - p_i)/p_i^2$ indicates the ordering ((6, 4), 2). Actually the
percentage error in this instance is 1.0 as compared with 1.7 and 2.4 for the
other two orderings. In case two, $k_i/p_i$ indicates the ordering ((2, 6), 4).
Although this does not turn out to be the best ordering, Table 1 shows that the
ordering in this instance makes little difference. In the last case, the
indicated ordering is ((2, 4), 6) and the percentage error for this is zero, as
opposed to 1.3 and 1.6. Since at any stage in the operation of combining boxes
two at a time (4.13) holds, the above procedure will also give the minimum
error for the approximation to $E_3[k_1(p_1), k_2(p_2), k_3(p_3)]$. Moreover,
the approximation for this quantity is always an underestimate of the true
value, and therefore that ordering should be taken which gives the greatest
value for the approximation.
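The ordering rule described above (increasing $k_i/p_i$, with ties broken by increasing $k_i(1 - p_i)/p_i^2$) can be sketched as a small routine. This is an illustrative sketch only; the function name is hypothetical, and exact fractions are used so that ties in $k_i/p_i$ are detected exactly. The three inputs are the three situations of Tables 1 and 2.

```python
from fractions import Fraction

def combination_order(boxes):
    """Sort (k, p) pairs by increasing k/p, breaking ties by increasing k(1 - p)/p^2."""
    def key(box):
        k, p = box
        return (k / p, k * (1 - p) / p ** 2)
    return [k for k, _ in sorted(boxes, key=key)]

sixth, third, half = Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)
cases = [
    [(2, sixth), (4, third), (6, half)],
    [(4, sixth), (6, third), (2, half)],
    [(6, sixth), (4, third), (2, half)],
]
for boxes in cases:
    # The first two integers listed are combined first, then the third.
    print(combination_order(boxes))
# prints [6, 4, 2], then [2, 6, 4], then [2, 4, 6]
```

These agree with the orderings ((6, 4), 2), ((2, 6), 4) and ((2, 4), 6) indicated in the text.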

When the error of the approximation to $E_1[k_1(p_1), \cdots, k_n(p_n)]$ and
$E_n[k_1(p_1), \cdots, k_n(p_n)]$, for $n$ greater than three, is considered,
it is immediately obvious that the general considerations already given in this
section still apply. In addition to these considerations, there is the
difficulty that errors may cumulate. However, the results already quoted for
three boxes, in conjunction with those which are to be given in 5.3, indicate
that this cumulation is not serious. There are two factors which eventually
prevent (i.e. as more and more boxes are considered) this percentage error from
becoming unduly large, and, in fact, make it approach zero. These are:

1. The value of $p_3$ will, in most instances, be decreasing as more and more
boxes are considered (see Fig. 1), and

2. The true value is usually becoming larger and larger as more and more boxes
are considered.

In order to minimize the error, the following precautions should be taken:

1. At each stage in the computation, try to avoid, as much as possible, making
readings where $E_1[X(p_1 + p_2), k_3(p_3)]$ is curving sharply. If all
readings are made where the curves are nearly linear, the percentage error will
be very close to zero. On the other hand, if many readings must be made where
the slopes of the curves are changing most sharply, larger errors must be
expected.

2. Use that ordering of the boxes which provides the minimum value for the
approximation to $E_1[\ ]$ or the maximum value for the approximation to
$E_n[\ ]$.

3. In order to approximate the ordering which (2) would give, compute
$k_i/p_i$ and $k_i(1 - p_i)/p_i^2$ at each stage at which two boxes are to be
combined and use the rules of procedure already given for three boxes.

5.3. Error of the approximation for $E_s[k^n]$. Repeated applications of the
reduction formulae (4.34) and (4.35) allow one to evaluate $E_s[k^n]$ by means
of the solution for the two box case, or more explicitly, by means of the
curves given in section 6. Here the error of this approximation will be
discussed primarily from a computational point of view.

$E_s[1^n]$ can be treated in detail since it is possible to obtain exact values
for this expression by means of (4.1). This has been done by McCarthy [5], but
the details will not be repeated here because of lack of space. The results
simply add more credence to the conjectures which will soon be made.

When $k$ is taken to be larger than one, the difficulty arises that it is
almost impossible to compute the exact value of $E_s[k^n]$ in a large number of
cases. Consequently it was necessary to devise an experimental model to
estimate these exact values so that the amount of error would be known within
bounds. A set of 10,000 punched cards¹ was obtained on which were recorded
100,000 random numbers drawn from a rectangular distribution. Thus if the cards
are ordered on a particular set of columns, and one reads off the digits 0-9 on
another specified column, one card at a time, it is equivalent to using a table
of random numbers such as those prepared by Tippett [7]. By the use of these
cards, it was possible to run off on an IBM Tabulator any desired number of
experiments in order to obtain an experimental distribution from which to
calculate an estimate of $E_s[k^n]$ and the variance of this estimate. For
example, in determining an estimate of $E_1[2^5]$ one hundred experimental
trials were made, as described above, with the following results:

  Number of Trials Required     Frequency
              2                     23
              3                     32
              4                     31
              5                     11
              6                      3

¹ These punched cards were prepared at Mayo Clinic, Rochester, Minn., under the
direction of Doctor Joseph Berkson.

From this distribution the estimate of $E_1[2^5]$ is 3.39, with a variance
computed from the distribution of .011. The 95% symmetric confidence limits for
the mean, computed from the Student t-distribution, are 3.17 and 3.61. Such
estimates will be used in the remainder of this section. It should be pointed
out that in order to prevent a prohibitive amount of machine time, it was

TABLE 3
Percentage Errors for $E_s[k^n]$

  s    k         n = 3             n = 4             n = 5
  1    1           -                 -
       2         + .7              +2.2          - .3   +13.6
       5         +1.1           -3.1   +5.7      + .6   +10.7
      10                                         -2.9   + 5.1
  2    1         -5.6              - .4              +1.3
       2         -4.6           -4.4   +4.4      + .6   +10.4
       5      -4.6   +1.7       +3.0   +9.3      +7.9   +14.8
      10      -3.7   +2.1       - .3   +5.5      +4.3   +10.7
      15                        +1.0   +7.2
      20      -2.5   +2.4
  3    1        -18.2             -12.7              -3.1
       2         -6.3          -16.5   -7.3      -2.9   + 6.0
       5      -9.7   -2.2      -10.7   -5.5      + .8   + 5.8
      10                                         -2.1   + 3.1
  4    1                          -12.0             -15.6
       2                       -13.6   +6.1     -11.6   - 3.9
       5                       -13.9   -7.2     - 9.9   - 4.0
      10                       - 8.9   -2.6     - 6.4   - 1.2
  5    1                                            - 8.8
       2                                        -18.1   - 6.0
       5                                        -12.5   - 5.6
      10                                        - 8.9   - 2.9

necessary to use many of the same runs to determine values of $E_s[k^n]$ for
different values of $s$, $k$ and $n$. This means that the errors are correlated
to some slight extent, but it would be extremely difficult to determine how
much.

A summary of the computed percentage errors for various values of $s$, $k$ and
$n$ is given in Table 3. In the instances where there are two entries, they are
calculated on the basis of the 95% confidence limits for the experimental mean.
These confidence limits are symmetric and were determined by using the Student
t-distribution. For $k$ equal to 2 and 5 the distribution contained 100 trials,
while for $k$ greater than 5, the distributions were made up of approximately
60 trials.
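The estimate and variance quoted for the one-hundred-trial experiment can be rechecked directly from the frequency distribution given above. The sketch below only reproduces the mean and the variance of the mean; the confidence limits additionally require a Student t critical value, which is not computed here. Variable names are illustrative.

```python
# Frequency distribution of the number of trials required, as tabulated in the text.
frequencies = {2: 23, 3: 32, 4: 31, 5: 11, 6: 3}

n_trials = sum(frequencies.values())
mean = sum(x * f for x, f in frequencies.items()) / n_trials

# Unbiased sample variance of a single observation, then the variance of the mean.
sum_sq = sum(f * (x - mean) ** 2 for x, f in frequencies.items())
sample_variance = sum_sq / (n_trials - 1)
variance_of_mean = sample_variance / n_trials

print(n_trials, round(mean, 2), round(variance_of_mean, 3))  # 100 3.39 0.011
```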

The computations given in this table show for various values of $s$, $k$ and
$n$ the percentage error of the approximation for $E_s[k^n]$. In addition to
showing the values of these percentage errors, the computations lead one to
conjecture that

1. For fixed $s$ and $k$, there exists an $n_0$ such that for $n > n_0$ the
absolute value of the percentage error of the approximation for $E_s[k^n]$ is a
monotone decreasing function for increasing $n$. It was shown by McCarthy [5]
that this absolute value approaches zero as $n$ approaches infinity for
$E_s[1^n]$, and in fact, that the difference between the true and approximate
values approaches zero.

2. For fixed $s$ and $n$, there exists a $k_0$ such that for $k > k_0$, the
absolute value of the percentage error of the approximation for $E_s[k^n]$ is a
monotone decreasing function for increasing $k$.
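The experimental model used for these comparisons has a direct modern analogue in pseudo-random simulation. The following is a minimal sketch, not the procedure of the paper: it assumes $n$ equally likely boxes, each with quota $k$, and counts throws until $s$ boxes have reached the quota. The function names, the seed and the number of trials are arbitrary choices made for illustration.

```python
import random

def throws_until(s, k, n, rng):
    """Throw balls uniformly into n boxes until s boxes hold at least k balls."""
    counts = [0] * n
    filled = 0
    throws = 0
    while filled < s:
        box = rng.randrange(n)
        counts[box] += 1
        throws += 1
        if counts[box] == k:  # this box has just reached its quota
            filled += 1
    return throws

def estimate(s, k, n, trials=20000, seed=0):
    """Return the sample mean of the throw count and the estimated variance of that mean."""
    rng = random.Random(seed)
    samples = [throws_until(s, k, n, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    variance = sum((x - mean) ** 2 for x in samples) / (trials - 1)
    return mean, variance / trials

# For s = 1, k = 2, n = 5 direct enumeration of the five possible stopping
# times (2 through 6) gives a mean of 3.5104, so the estimate should be close.
print(estimate(1, 2, 5))
```

With many more trials than the sixty to one hundred available on punched cards, the confidence limits for the experimental mean become correspondingly narrower.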

6. Computation.

6.1. Curves to aid in the computation of $E_1[k_1(p_1), k_2(p_2)]$. In 3.1 it
was shown that $E_1[k_1(p_1), k_2(p_2)]$ is equal to

$\dfrac{k_1}{p_1}\, I_{p_1}(k_1 + 1, k_2) + \dfrac{k_2}{p_2}\, I_{p_2}(k_2 + 1, k_1)$,

where $I_x(p, q)$ is the Incomplete Beta-Function as tabled by Karl Pearson
[6]. There are three principal difficulties connected with the use of these
tables as they apply to the approximations of this paper. These are:

1. The tables must be available,

2. The tables give directly only values for integer or half-integer values of
$k_1$ and $k_2$, and

3. Since many different values of $E_1[k_1(p_1), k_2(p_2)]$ are often required
to obtain a single approximation, the computational burden would be very heavy.

In order to surmount these difficulties, it seemed advisable to prepare curves
giving the values of $E_1[k_1(p_1), k_2(p_2)]$ for various values of $k_1$,
$k_2$, $p_1$ and $p_2$. These curves would give values of $E_1[\ ]$ with
sufficient accuracy for most problems not only for integer values of $k_1$ and
$k_2$, but for all values over the range considered.

Such curves have been prepared by computing $E_1[k_1(p_1), k_2(p_2)]$ for
integral values of $k_1$ and $k_2$ (for fixed $p_1$ and $p_2$) and then joining
these points with a smooth curve. A summary of the graphs prepared is as
follows:

                 k_1                     k_2             p_1    p_2
  Fig. 3    1, 2, ..., 10, ∞       1, 2, ..., 35        .50    .50
  Fig. 4    1, 2, ..., 20, ∞       1, 2, ..., 35        .40    .60
  Fig. 5    1, 2, ..., 16, ∞       1, 2, ..., 35        .20    .80
  Fig. 6    1, 2, ..., 10, ∞       1, 2, ..., 15        .80    .20
  Fig. 7    1, 2, ..., 7, ∞        1, 2, ..., 15        .60    .40
  Fig. 8    1, 2, ..., 8, ∞        1, 2, ..., 15        .50    .50
  Fig. 9    1, 2, ..., 6, ∞        1, 2, ..., 15        .40    .60
  Fig. 10   1, 2, ..., 5, ∞        1, 2, ..., 15        .20    .80

Figures 8, 9, and 10 are simply portions of figures 3, 4 and 5 drawn on an
expanded scale in order to permit greater accuracy in reading the curves. Also
figures 6 and 10 and figures 7 and 9 form pairs in that a member of one pair
can be obtained from the other member of the pair. Both members of the pair are
given on the expanded scale in order to facilitate interpolation. Values of the
mean for combinations of $k_1$ and $k_2$ not given directly can usually be
obtained
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extremely poor. 
As an example, suppose one has two boxes with $k_1 = 2$, $k_2 = 5$,
$p_1 = .40$ and $p_2 = .60$. Consulting Fig. 9, one goes along the horizontal
axis to $k_2 = 5$. Following up the vertical line through this point to the
curve $k_1 = 2$, $E_1[2(.40), 5(.60)]$ is read as 4.25. The actually computed
value to four decimals is 4.2224.

It is immediately evident that $E_2[k_1(p_1), k_2(p_2)]$ can also be obtained
from the curves since

$E_2[k_1(p_1), k_2(p_2)] = (k_1/p_1) + (k_2/p_2) - E_1[k_1(p_1), k_2(p_2)]$.

6.2. Use of the curves to obtain exact values (i.e. subject only to the error
of reading the curves) for $E_1[k_1(p_1), k_2(p_2), k_3(p_3)]$. Referring back
to (4.8), one obtains that

(6.1)  $E_1[k_1(p_1), k_2(p_2), k_3(p_3)] = \sum_i \pi_i\, E_1[x_i(p_1 + p_2), k_3(p_3)]$,




378 


PHILIP J. MCCARTHY 



where $\pi_i$ is the probability that either $k_1$ balls are obtained in the
first box or $k_2$ balls are obtained in the second box on the $x_i$th throw
for the first time, assuming balls can go only in boxes one and two. $x_i$
takes on values

$k_1, k_1 + 1, \cdots, k_1 + k_2 - 1$

when $k_1 \le k_2$. Now $\pi_i$ can be easily computed and
$E_1[x_i(p_1 + p_2), k_3(p_3)]$ can be obtained from the curves. The only
difficulty in using this procedure arises when the range of $x_i$ is large.
Then a large amount of computation is involved.

In order to illustrate this computation, consider $E_1[2(.1), 3(.1), 5(.8)]$.
Here $x_i$ takes on the values 2, 3 and 4. We have $x_1 = 2$, $\pi_1 = 2/8$;
$x_2 = 3$, $\pi_2 = 3/8$; and $x_3 = 4$, $\pi_3 = 3/8$. From Fig. 6

$E_1[2(.2), 5(.8)] = 5.09$
$E_1[3(.2), 5(.8)] = 5.88$
$E_1[4(.2), 5(.8)] = 6.11$.

Consequently, $E_1[2(.1), 3(.1), 5(.8)]$ is equal to

(5.09)(2/8) + (5.88)(3/8) + (6.11)(3/8) = 5.77.

Using computed values for $E_1[x_i(.2), 5(.8)]$, $E_1[2(.1), 3(.1), 5(.8)]$ is
equal to 5.75. Thus the use of the curves has only led to an error of .3%.
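The mixing computation of this subsection can be checked without the curves when all quotas are integers. The sketch below is illustrative: the function names are hypothetical, the two-box mean is computed by summing over the possible stopping throws rather than from tables, and the stopping probabilities $\pi_i$ are computed for the first two boxes taken alone.

```python
from math import comb

def e1_two_boxes(k1, p1, k2, p2):
    """Exact mean throws until box 1 holds k1 or box 2 holds k2 (integer quotas, p1 + p2 = 1)."""
    first = sum((k1 + j) * comb(k1 + j - 1, j) * p1 ** k1 * p2 ** j for j in range(k2))
    second = sum((k2 + j) * comb(k2 + j - 1, j) * p2 ** k2 * p1 ** j for j in range(k1))
    return first + second

def stopping_probs(k1, p1, k2, p2):
    """Probability that one of the two quotas is first met on throw x, for each possible x."""
    probs = {}
    for x in range(min(k1, k2), k1 + k2):
        pi = 0.0
        if 0 <= x - k1 < k2:   # box 1 completes on throw x
            pi += comb(x - 1, k1 - 1) * p1 ** k1 * p2 ** (x - k1)
        if 0 <= x - k2 < k1:   # box 2 completes on throw x
            pi += comb(x - 1, k2 - 1) * p2 ** k2 * p1 ** (x - k2)
        probs[x] = pi
    return probs

def e1_three_boxes(k1, p1, k2, p2, k3, p3):
    """Mix two-box means over the stopping throw of the first two boxes, as in (6.1)."""
    mass = p1 + p2
    probs = stopping_probs(k1, p1 / mass, k2, p2 / mass)
    return sum(pi * e1_two_boxes(x, mass, k3, p3) for x, pi in probs.items())

print(stopping_probs(2, 0.5, 3, 0.5))                   # weights 2/8, 3/8, 3/8 for x = 2, 3, 4
print(round(e1_three_boxes(2, 0.1, 3, 0.1, 5, 0.8), 2))  # 5.75
```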
6.3. Use of the curves in approximating $E_1[k_1(p_1), \cdots, k_n(p_n)]$,
$E_n[k_1(p_1), \cdots, k_n(p_n)]$ and $E_s[k^n]$. In illustrating the
application of the curves and the reduction formulae (4.34) and (4.35), one
example will be worked through in detail. This example will provide
illustrations of all the details involved in such problems. Consider
$E_4[5^5]$. Applying formula (4.34)

(6.2)  $E_4[5^5] \cong \frac{5}{4} \left\{ E_1[5^4] + E_3\left[ \left( 5 - \frac{E_1[5^4] - 5}{3} \right)^3, \left( 5 - \frac{E_1[5^4]}{4} \right) \right] \right\}$.

Consequently, the first step must be to compute $E_1[5^4]$. Using the
principles of 4.4

(6.3)  $E_1[5^4] \cong E_1[E_1[5^2](.50), 5(.25), 5(.25)]$.

From Fig. 8, $E_1[5^2] = 7.55$. Therefore $E_1[5^4]$ is approximately equal to

$E_1[7.55(.50), 5(.25), 5(.25)]$.

Now applying the same principle again,

(6.4)  $E_1[5^4] \cong E_1[E_1[7.55(\tfrac{2}{3}), 5(\tfrac{1}{3})](.75), 5(.25)]$.

By the use of figures 7, 8, 9 and 10, graphical interpolation may be applied to
find that $E_1[7.55(\tfrac{2}{3}), 5(\tfrac{1}{3})]$ is equal to 9.84. The
approximation procedure now says that

(6.5)  $E_1[5^4] \cong E_1[9.84(.75), 5(.25)]$.

Again applying the curves and using graphical interpolation for $p_1$ and
$p_2$, $E_1[5^4] \cong 11.88$.

Substituting this value in (6.2),

(6.6)  $E_4[5^5] \cong \frac{5}{4} \{11.88 + E_3[2.71, 2.71, 2.71, 2.03]\}$.

Now formula (4.35) must be applied to $E_3[2.71, 2.71, 2.71, 2.03]$, i.e.

(6.7)  $E_3[2.71, 2.71, 2.71, 2.03] \cong \frac{4}{3} \left\{ E_1[(2.71)^3] + E_2\left[ \left( 2.71 - \frac{E_1[(2.71)^3] - 2.71}{2} \right)^2, \left( 2.03 - \frac{E_1[(2.71)^3]}{3} \right) \right] \right\}$.

$E_1[(2.71)^3]$ can be evaluated by the same method used for $E_1[5^4]$. This
leads to the result

(6.8)  $E_3[2.71, 2.71, 2.71, 2.03] \cong \frac{4}{3} \{4.40 + E_2[1.86, 1.86, .56]\}$.

Once more applying (4.35)

(6.9)  $E_2[1.86, 1.86, .56] \cong \frac{3}{2} \left\{ E_1[(1.86)^2] + E_1\left[ \left( 2 \cdot 1.86 - E_1[(1.86)^2] \right), \left( .56 - \frac{E_1[(1.86)^2]}{2} \right) \right] \right\}$.

$E_1[1.86, 1.86]$ is equal, by the curves, to 2.25. Therefore

(6.10)  $E_2[1.86, 1.86, .56] \cong \frac{3}{2} \{2.25 + E_1[1.47, -.56]\}$.

However, since the convention is observed that a negative quantity is replaced
by zero,

(6.11)  $E_1[1.47, -.56] = E_1[1.47, 0] = 0$.

Now working back through these various expressions,

(6.12)  $E_4[5^5] \cong \frac{5}{4} [11.88 + \frac{4}{3} [4.40 + \frac{3}{2} [2.25 + 0]]] = 27.81$.

From Table 3 it can be seen that the percentage errors for this approximation
to $E_4[5^5]$, corresponding to the 95% confidence limits for this quantity,
are -4.0% and -9.9%.

This example has illustrated most of the situations which will arise in the use 
of the approximations of this paper. 
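The arithmetic of the worked example can be retraced in a few lines. The sketch below is illustrative only: it takes the three curve readings quoted in the text (11.88, 4.40 and 2.25) as given inputs, assumes that each reduction step replaces the common quota $k_1$ by $k_1 - (E_1 - k_1)/(n - 2)$ and the last quota $k_2$ by $k_2 - E_1/(n - 1)$ with multiplier $n/(n - 1)$, and uses a hypothetical helper name.

```python
def reduce_arguments(k1, k2, n, e1):
    """Quotas passed to the next stage after one reduction step on n boxes.

    k1 is the common quota of the first n - 1 boxes, k2 the quota of the last
    box, and e1 the value of E_1 for the first n - 1 boxes (a curve reading).
    """
    return k1 - (e1 - k1) / (n - 2), k2 - e1 / (n - 1)

# Curve readings quoted in the text for the three stages of the example.
e1_stage1, e1_stage2, e1_stage3 = 11.88, 4.40, 2.25

a1, b1 = reduce_arguments(5, 5, 5, e1_stage1)     # about 2.71 and 2.03
a2, b2 = reduce_arguments(a1, b1, 4, e1_stage2)   # about 1.86 and 0.56
a3, b3 = reduce_arguments(a2, b2, 3, e1_stage3)   # about 1.47 and -0.56

# A negative quota is replaced by zero, which makes the innermost term vanish.
innermost = 0.0 if min(a3, b3) <= 0 else None
total = 5 / 4 * (e1_stage1 + 4 / 3 * (e1_stage2 + 3 / 2 * (e1_stage3 + innermost)))
print(round(total, 2))  # 27.81
```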

6.4. Miscellaneous approximation formulae useful for computation. There exists
a relatively simple approximation to $E_1[k_1(p_1), k_2(p_2)]$,
$p_1 + p_2 = 1$, when $p_2$ is near one. Using (3.8) and making some obvious
simplifications, one obtains

$E_1[k_1(p_1), k_2(p_2)] = \dfrac{k_2}{p_2} - \dfrac{1}{p_1 p_2}\, \dfrac{(k_1 + k_2)!}{(k_1 - 1)!\,(k_2 - 1)!} \displaystyle\int_0^{p_1} t^{k_1 - 1} (1 - t)^{k_2 - 1} (p_1 - t)\, dt$.

Since $p_1$ is near zero, $(1 - t)$ can be replaced by one, and the result is
obtained that

$E_1[k_1(p_1), k_2(p_2)] \cong \dfrac{k_2}{p_2} - \dfrac{1}{p_2}\, \dfrac{(k_1 + k_2)!}{(k_1 + 1)!\,(k_2 - 1)!}\, p_1^{k_1}$.

An approximation to the Incomplete Beta-Function, given by Tukey and Scheffé
[8], may also prove useful at times. The expression, changed slightly by those
authors since publication, is

- , + 1, ^ 1 _ ^ f “ (l) 

where 


a 

Xa 


2r 


(1 _ 5) !L±i 

r 

■\/b 


1 


+ 2r. 


The right hand side of the first expression will be recognized as the $\chi^2$
distribution with $2r$ degrees of freedom. In the event that the tables of
$\chi^2$ are not adequate for the application of these expressions, the
approximation of Wilson and Hilferty [10] should be used. This approximation
states that $(\chi^2/\nu)^{1/3}$, where $\nu$ is the number of degrees of
freedom, is approximately normally distributed with mean $1 - 2/(9\nu)$ and
variance $2/(9\nu)$, for large $\nu$.
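The Wilson and Hilferty approximation just stated is easily put into computable form. The following is a minimal sketch, with a hypothetical function name, that turns the cube-root normal approximation into an approximate $\chi^2$ distribution function; the two check values are the standard tabled 5% and 95% points of $\chi^2$ with 10 degrees of freedom.

```python
import math

def chi2_cdf_wilson_hilferty(x, nu):
    """Approximate P(chi-square with nu degrees of freedom <= x)."""
    mean = 1.0 - 2.0 / (9.0 * nu)
    sd = math.sqrt(2.0 / (9.0 * nu))
    z = ((x / nu) ** (1.0 / 3.0) - mean) / sd
    # Standard normal distribution function expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(chi2_cdf_wilson_hilferty(18.307, 10))  # close to 0.95
print(chi2_cdf_wilson_hilferty(3.940, 10))   # close to 0.05
```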
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THE DISTRIBUTION OF THE RANGE^ 

By E. J, Gumbel 
Brooklyn College, N. Y. 

1. Summary. The asymptotic distribution of the range $w$ for a large sample
taken from an initial unlimited distribution possessing all moments is obtained
by the convolution of the asymptotic distribution of the two extremes. Let
$\alpha$ and $u$ be the parameters of the distribution of the extremes for a
symmetrical variate, and let $R = \alpha(w - 2u)$ be the reduced range. Then
its asymptotic probability $\Psi(R)$ and its asymptotic distribution $\psi(R)$
may be expressed by the Hankel function of order one and zero. A table is given
in the text.

The asymptotic distribution $g(w)$ of the range proper is obtained from
$\psi(R)$ by the usual linear transformation. The initial distribution and the
sample size influence the position and the shape of the distribution of the
range in the same way as they influence the distribution of the largest value.
If we take the parameters from the calculated means and standard deviations,
the asymptotic distribution of the range gives a good fit to the calculated
distributions for normal samples from size 6 onward. Consequently the
distribution of the range for normal samples of any size larger than 6 may be
obtained from the asymptotic distribution of the reduced range.

The asymptotic probabilities and the asymptotic distributions of the midrange
and of the range for asymmetrical distributions are obtained by the same method
and lead to integrals which may be evaluated by numerical methods.

2. Introduction. For any initial distribution, and any sample size $n$, the distribution of the range may easily be written down in the form of an integral. However, for many given initial distributions the integration can be carried out, if at all, only for very small sample sizes, say $n = 2$ or $n = 3$. For larger samples, complicated numerical calculations have to be made, and there is no way of obtaining the distribution for $n + 1$ observations from the distribution for $n$ observations.

Our object is to obtain the asymptotic distribution of the range. Nothing is supposed to be known about the initial distribution, except that it is of the exponential type [9] which assures that it is unlimited in both directions, and possesses all moments. It will be shown that this condition is sufficient for the existence of an asymptotic distribution of the range.

With increasing sample sizes the distribution of the range may approach its asymptotic form in a quick, or in a slow way. This behavior depends upon the nature of the initial distribution. Two examples for this approach will be shown.


¹ Research done with the support of a grant from the Social Science Research Council.
3. The exact distribution of the range. Let $\varphi(x)$ be any initial distribution, $\Phi(x)$ the probability of a value equal to, or less than, $x$. Then, for samples of size $n$, the joint distribution $\mathfrak{w}_n(x_1, x_n)$ of the smallest value $x_1$ and the largest value $x_n$ is

(1)   $\mathfrak{w}_n(x_1, x_n) = n(n-1)\,\varphi(x_1)\,(\Phi(x_n) - \Phi(x_1))^{n-2}\,\varphi(x_n).$

The distribution of the range defined by

(2)   $x_n = x_1 + w_n$

is obtained by integrating over all values $x_1$, whence

(3)   $g_n(w_n) = n(n-1) \int_{-\infty}^{\infty} (\Phi(x + w_n) - \Phi(x))^{n-2}\,\varphi(x + w_n)\,\varphi(x)\,dx,$

where the index 1 has been dropped. The probability $G_n(w_n)$ for the range to be equal to, or less than, $w_n$ is obtained by integration of (3), whence, by reversing the order of integration,

$G_n(w_n) = n \int_{-\infty}^{\infty} \int_{0}^{w_n} (n-1)\,(\Phi(x + w) - \Phi(x))^{n-2}\,d\Phi(x + w)\,d\Phi(x),$

or, after integration,

$G_n(w_n) = n \int_{-\infty}^{\infty} (\Phi(x + w_n) - \Phi(x))^{n-1}\,d\Phi(x),$

a formula to which Prof. H. Hotelling has drawn my attention. The beauty of this formula is completely marred by the facts that, in general, we cannot express $\Phi(x + w_n)$ by $\Phi(x)$, and that the numerical integration is lengthy and tiresome.
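With a modern machine the numerical integration is no longer tiresome. The following is a minimal sketch of the last formula for a standard normal initial distribution, using Simpson's rule. The function names, the integration limits, and the number of steps are assumptions chosen for illustration, not part of the paper.

```python
import math


def normal_cdf(x):
    """Phi(x) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """phi(x) for the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def range_cdf(w, n, cdf=normal_cdf, pdf=normal_pdf,
              lo=-10.0, hi=10.0, steps=4000):
    """G_n(w) = n * integral of (Phi(x+w) - Phi(x))^(n-1) * phi(x) dx.

    Composite Simpson rule on [lo, hi]; steps must be even.
    """
    if w <= 0.0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        value = (cdf(x + w) - cdf(x)) ** (n - 1) * pdf(x)
        if i == 0 or i == steps:
            weight = 1
        elif i % 2 == 1:
            weight = 4
        else:
            weight = 2
        total += weight * value
    return n * total * h / 3.0


if __name__ == "__main__":
    for size in (2, 5, 10):
        print(size, range_cdf(3.0, size))
```

For $n = 2$ the result can be checked against the closed form: the difference of two independent standard normal values has variance 2, so $G_2(w) = \operatorname{erf}(w/2)$.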

The problem of the range for the normal distribution was first raised twenty-five years ago by L. von Bortkiewicz [1, 2]. For $n = 2$ and $n = 3$ the distribution of the normal range may be written down explicitly [12, 13]. For larger normal samples up to $n = 20$, E. S. Pearson [16] and H. O. Hartley [10] have calculated numerical tables of the probability of the range. L. H. C. Tippett [20] has calculated the mean, the standard deviation, and the moment quotients for the range of the normal distribution up to $n = 1000$. He gave formulae for the moments in the form of integrals. Finally "Student" [18] reproduced the distribution of the range for small samples, $n$ = 2, 3, 4, 5, 6, 10, by Pearson's type I, and gave a formula for large samples $n$ = 20, 60, based on Pearson's type VI, a procedure which is purely empirical and, therefore, unsatisfactory for theoretical purposes. A good résumé of the present knowledge about the range is given in Karl Pearson's Tables [17].

All these studies are confined to the normal distribution and allow no conclusion about the asymptotic distribution of the range. According to Kendall [11] it is not known whether such forms exist and what they are. This question may at once be answered for a special case. If the distribution is limited to the left (or to the right), the asymptotic distribution of the range is equal to the asymptotic distribution of the largest (smallest) value. The asymptotic distribution of the range exists provided that an asymptotic distribution of the largest (smallest) value exists. For the exponential distribution, and for initial distributions of the Pareto type, for example, the asymptotic distribution of the range is equal to the asymptotic distribution of the largest value. The asymptotic distribution of the range for the rectangular distribution has been derived by A. G. Carlton [3].

4. The asymptotic distribution of the reduced range for a symmetrical variate. Instead of the procedures mentioned in the last paragraph, let us consider a large sample. It is generally assumed that the smallest and the largest values are independent in that case. L. H. C. Tippett [20] has shown that the correlation between the extremes is negligible for the normal distribution and for sample sizes $n \geq 200$. In a previous note [9] it has been shown that independence holds for large samples and for initial distributions of the exponential type unlimited in both directions and possessing all moments. Then the joint distribution (1) splits into the product of the asymptotic distribution $f_1(x_1)$ of the smallest value $x_1$ and the asymptotic distribution $f_n(x_n)$ of the largest value $x_n$

(4)   $\mathfrak{w}(x_1, x_n) = f_1(x_1) \cdot f_n(x_n).$

If, furthermore, the initial distribution is symmetrical about zero, the two asymptotic distributions are

(5)   $f_1(x_1) = \alpha \exp[\alpha(x_1 + u) - e^{\alpha(x_1 + u)}]; \qquad f_n(x_n) = \alpha \exp[-\alpha(x_n - u) - e^{-\alpha(x_n - u)}].$

These asymptotic distributions and the corresponding probabilities are traced, in a reduced scale, on Graphs (1) and (2).

Since the two parameters $u$ and $\alpha$ will exist also in the asymptotic distribution of the range, their nature must briefly be explained. The value $u$ is defined as the solution of

(6)   $\Phi(u) = 1 - \dfrac{1}{n}.$

Since

(6')   $n(1 - \Phi(u)) = 1,$

the largest value $u$ may be called the expected largest value. It differs, of course, from the mean of the largest value. It has been shown [6] that $u$ increases as a function of the logarithm of $n$, the function depending upon the initial distribution.

Criteria for the approach of the distribution of the largest value toward its asymptotic form have been given by R. A. Fisher and L. H. C. Tippett [4].
For our purpose it is sufficient to consider whether $n$ is so large that $u$ is very near to the most probable largest value $\tilde{x}_n$ obtained from

(7)   $\dfrac{\varphi'(\tilde{x}_n)}{\varphi(\tilde{x}_n)} = -(n-1)\,\dfrac{\varphi(\tilde{x}_n)}{\Phi(\tilde{x}_n)}.$

If

$\tilde{x}_n \approx u$

holds with sufficient approximation, $2u$ may be interpreted as the range of the modes for an initial symmetrical distribution.

The parameter $\alpha$ defined by

(8)   $\alpha = \dfrac{\varphi(u)}{1 - \Phi(u)}$

also is a function of $n$. Three cases have to be distinguished: In the first case, $\alpha$ is a constant, or converges with $n$ toward a constant different from zero. In the second (and third) case, $\alpha$ increases with $n$ without limit (decreases with $n$ toward zero). The three cases correspond to three classes of initial distributions of the exponential type. The function $\alpha$ is related to the asymptotic standard error of the largest, and of the smallest value by

$\alpha^2 \sigma_n^2 = \alpha^2 \sigma_1^2 = \dfrac{\pi^2}{6}.$

If $\alpha$ increases (decreases) with $n$, or is independent of $n$, the standard error of the largest value decreases (increases) with the sample size, or is independent of it. This behavior has nothing to do with the fact that the standard error of the mean decreases, of course, with an increasing number of samples.

The determination of the constants $u$ and $\alpha$ from equations (6), (7), (8) is based on the knowledge of the initial distribution and the sample size $n$ from which we take the largest observation. This method cannot be used in many practical applications: 1) It may happen that the initial distribution, or the parameters it contains, are unknown. Therefore the parameters of the largest value cannot be obtained from it. 2) The initial distribution might be known, but the number of observations is insufficient to warrant this procedure, because the most probable largest value $\tilde{x}_n$ differs from the expected value $u$. In these cases the parameters $u$ and $\alpha$ have to be estimated from the observed distribution of the largest value alone. A similar procedure will be used for the range in paragraph 7.

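From (4) and (5) the joint asymptotic distribution $\mathfrak{w}(x_1, w)$ of the smallest value $x_1$ and the range $w$ becomes

$\mathfrak{w}(x_1, w) = \alpha^2 \exp[-\alpha(w - 2u) - e^{\alpha(x_1 + u)} - e^{-\alpha(x_1 + w - u)}].$

The asymptotic distribution $g(w)$ of the range alone is, dropping the index 1,

(4')   $g(w) = \alpha^2 e^{-\alpha(w - 2u)} \int_{-\infty}^{\infty} \exp[-e^{\alpha(x + u)} - e^{-\alpha(x + w - u)}]\,dx.$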
This distribution contains the two parameters $\alpha$ and $u$ existing in the asymptotic distribution of the largest value. To eliminate the two parameters, a reduced range $R$ is introduced by

(10)   $R = \alpha(w - 2u).$

The range $w$ is a positive variate unlimited toward the right. The reduced range $R$ is also unlimited toward the right yet limited toward the left by

(10')   $R \geq -2\alpha u.$

The reduced range is not related to one of the averages of the range. It is the range minus the range of the modes divided by a factor which is proportional to the standard error of the extreme value. The distribution $\psi(R)$ of the reduced range $R$, and the distribution $g(w)$ of the range $w$ are related by

(11)   $\psi(R) = \dfrac{1}{\alpha}\,g(w),$

subject to restriction (10'), whereas the probability $\Psi(R)$ of the reduced range to be equal to, or less than, $R$ is equal to the corresponding expression $G(w)$ for the range proper

(11')   $\Psi(R) = G(w).$

For the integration in (4') we put

$\alpha(x + w - u) = -y$

whence, from (10),

$\alpha(x + u) = -y - R.$

The asymptotic distribution of the reduced range becomes

(12)   $\psi(R) = e^{-R} \int_{-\infty}^{\infty} \exp[-e^{y} - e^{-y-R}]\,dy$

and the asymptotic probability $\Psi(R)$ of the range is

(13)   $\Psi(R) = \int_{-\infty}^{\infty} \exp[y - e^{y} - e^{-y-R}]\,dy$

an expression which may easily be verified by differentiation.
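The two integrals can be evaluated directly. The sketch below applies Simpson's rule to (12) and (13) as written, $\psi(R) = e^{-R}\int \exp[-e^{y} - e^{-y-R}]\,dy$ and $\Psi(R) = \int \exp[y - e^{y} - e^{-y-R}]\,dy$. The function names, the finite limits $[-50, 10]$ and the step count are assumptions for illustration; the limits are adequate for reduced ranges of roughly $-5 \leq R \leq 40$, where the integrands are negligible outside the interval.

```python
import math


def simpson(f, a, b, steps=6000):
    """Composite Simpson rule for f on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def Psi_integral(R, lo=-50.0, hi=10.0):
    """Asymptotic probability of the reduced range, integral (13)."""
    return simpson(
        lambda y: math.exp(y - math.exp(y) - math.exp(-y - R)), lo, hi)


def psi_integral(R, lo=-50.0, hi=10.0):
    """Asymptotic density of the reduced range, integral (12)."""
    return math.exp(-R) * simpson(
        lambda y: math.exp(-math.exp(y) - math.exp(-y - R)), lo, hi)


if __name__ == "__main__":
    for R in (-2.0, 0.0, 2.0, 5.0):
        print(R, Psi_integral(R), psi_integral(R))
```

The values at $R = 0$ and $R = 2$ can be compared with the entries of Table I.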

The asymptotic formulas (12) and (13) hold for any initial symmetrical distribution of the exponential type, for example, for the normal and the logistic distribution (see par. 7). The mean reduced range $\bar{R}$ and the higher moments of the reduced range are easily obtained from the mean $\bar{w}$, the variance $\sigma_w^2$, and the invariants $\lambda_\nu$ of order $\nu$ of the range proper $w$ given in a previous paper [8]. They are

(14)   $\bar{w} = 2u + \dfrac{2\gamma}{\alpha}; \qquad \sigma_w^2 = \dfrac{\pi^2}{3\alpha^2}$

(15)   $\lambda_\nu = \dfrac{2(\nu - 1)!}{\alpha^\nu} \sum_{k=1}^{\infty} \dfrac{1}{k^\nu}; \qquad \nu \geq 2$

where $\gamma$ stands for Euler's constant.

Consequently the mean $\bar{R}$, the variance $\sigma_R^2$ and the invariants $\lambda_\nu$ of the reduced range are

(16)   $\bar{R} = 2\gamma; \qquad \sigma_R^2 = \dfrac{\pi^2}{3}; \qquad \lambda_\nu = 2(\nu - 1)! \sum_{k=1}^{\infty} \dfrac{1}{k^\nu}; \qquad \nu \geq 2.$

Equation (14) leads to an interpretation of the reduction (10) which may be written

$R = \alpha(w - \bar{w}) + 2\gamma$

or

(14')   $R = \dfrac{\pi}{\sqrt{3}}\,\dfrac{w - \bar{w}}{\sigma_w} + 2\gamma.$

Thus the transformation (10) is a linear function of the standard transformation $(w - \bar{w})/\sigma_w$ usual in statistics.

5. The probability of the range as a Bessel function. The integrals (12) and (13) may be evaluated by numerical procedures, since tables of the function $\exp(-e^{-x})$ are easily calculated. However, it turned out to be simpler to relate these integrals to the solution of a differential equation. The derivative $\psi'(R)$ of the distribution (12) is

$\psi'(R) = -\psi(R) + e^{-R} \int_{-\infty}^{\infty} \exp[-y - R - e^{y} - e^{-y-R}]\,dy.$

The integral is equal to the probability $\Psi(R)$ since the transformation

$y + R = -z$

leads to

$\int_{-\infty}^{\infty} \exp[-y - R - e^{y} - e^{-y-R}]\,dy = \int_{-\infty}^{\infty} \exp[z - e^{-z-R} - e^{z}]\,dz.$

Consequently the probability $\Psi(R)$ is subject to the differential equation

(17)   $\Psi'' + \Psi' - e^{-R}\,\Psi = 0.$
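The mode of the reduced range is a fixed value $\tilde{R}$ such that

(18)   $\psi(\tilde{R}) = e^{-\tilde{R}}\,\Psi(\tilde{R}).$

Mr. W. Wasow (Swarthmore College) has drawn my attention to the fact that the probability $\Psi(R)$ of the range can be expressed in terms of a Bessel function.² To obtain this simplification of the differential equation we introduce a new positive variable $z$ by

(19)   $z = 2e^{-R/2}$

and a new function $U$ by

(20)   $\Psi = U \cdot z.$

The boundary conditions are

(21)   $z = 0,\ \Psi = 1; \qquad z = \infty,\ \Psi = 0.$

The first derivative becomes, from (19),

$\dfrac{d\Psi}{dR} = -\dfrac{z}{2}\,\dfrac{d\Psi}{dz}$

whence, from (20),

$\dfrac{d\Psi}{dR} = -\dfrac{z}{2}\left(U + z\,\dfrac{dU}{dz}\right).$

The second derivative becomes, by the same procedure,

$\dfrac{d^2\Psi}{dR^2} = \dfrac{z}{2}\,\dfrac{d}{dz}\left(\dfrac{z}{2}\,U + \dfrac{z^2}{2}\,\dfrac{dU}{dz}\right).$

The second member may be written

$\dfrac{z}{2}\left(\dfrac{U}{2} + \dfrac{z}{2}\,\dfrac{dU}{dz} + z\,\dfrac{dU}{dz} + \dfrac{z^2}{2}\,\dfrac{d^2U}{dz^2}\right) = \dfrac{z}{4}\,U + \dfrac{3z^2}{4}\,\dfrac{dU}{dz} + \dfrac{z^3}{4}\,\dfrac{d^2U}{dz^2}.$

Thus the differential equation (17) is now

$\dfrac{z^3}{4}\,\dfrac{d^2U}{dz^2} + \dfrac{3z^2}{4}\,\dfrac{dU}{dz} - \dfrac{z^2}{2}\,\dfrac{dU}{dz} + \dfrac{z}{4}\,U - \dfrac{z}{2}\,U - \dfrac{z^3}{4}\,U = 0.$

Multiplication by $4z^{-1}$ leads to

$z^2\,\dfrac{d^2U}{dz^2} + z\,\dfrac{dU}{dz} - (1 + z^2)\,U = 0.$

This is one of the classical Bessel differential equations of order 1. In the notation used by the British Tables [14] (pp. 264 and 213) one of the solutions is

(22)   $U(z) = K_1(z),$

² I profit of this occasion to thank him for this and other valuable suggestions.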




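where $K_1(z)$, the modified Bessel function of the second kind (Hankel function), is defined by

(23)   $K_1(z) = (\gamma - \lg 2 + \lg z) \sum_{\nu=0}^{\infty} \dfrac{1}{\nu!\,(\nu+1)!}\left(\dfrac{z}{2}\right)^{2\nu+1} + \dfrac{1}{z} - \sum_{\nu=1}^{\infty} \dfrac{1}{(\nu-1)!\,\nu!}\left(\dfrac{z}{2}\right)^{2\nu-1}\left(1 + \dfrac{1}{2} + \cdots + \dfrac{1}{\nu-1} + \dfrac{1}{2\nu}\right).$

The relation between the functions $K_\nu(z)$ and the Hankel function is

(23a)   $K_\nu(z) = \dfrac{\pi i}{2}\,e^{\nu\pi i/2}\,H_\nu^{(1)}(iz).$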

The asymptotic probability for the range is, from (20) and (22),

(24)   $\Psi(R) = z\,K_1(z)$

or, from (19),

(25)   $\Psi(R) = 2e^{-R/2}\,K_1(2e^{-R/2}).$

This is the only Bessel function satisfying the boundary conditions (21). The asymptotic probability $\Psi(R)$ of the range may be written finally from (25), (23) and (10)

(25a)   $1 - \Psi(R) = \sum_{\nu=1}^{\infty} \dfrac{e^{-\nu R}}{(\nu-1)!\,\nu!}\left(R - 2\gamma + 2S_\nu - \dfrac{1}{\nu}\right)$

where

$S_0 = 0; \qquad S_\nu = \sum_{\lambda=1}^{\nu} \dfrac{1}{\lambda}.$

The distribution

$\psi(R) = \dfrac{d\Psi(R)}{dz}\,\dfrac{dz}{dR}$

of the reduced range $R$ is, from (24) and (19),

$\psi(R) = -\dfrac{z}{2}\,(K_1(z) + z\,K_1'(z)).$

Now, the derivative $K_1'(z)$ is linked to the modified Bessel function $K_0(z)$ of the second kind and of order zero by

$z\,K_1'(z) = -K_1(z) - z\,K_0(z).$

Consequently the distribution is

(26)   $\psi(R) = \dfrac{z^2}{2}\,K_0(z)$
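or, from (19),

(27)   $\psi(R) = 2e^{-R}\,K_0(2e^{-R/2})$

where the function $K_0(z)$ is defined by

(28)   $K_0(z) = -(\gamma - \lg 2 + \lg z) \sum_{\nu=0}^{\infty} \dfrac{1}{\nu!\,\nu!}\left(\dfrac{z}{2}\right)^{2\nu} + \sum_{\nu=1}^{\infty} \dfrac{1}{\nu!\,\nu!}\left(\dfrac{z}{2}\right)^{2\nu}\left(1 + \dfrac{1}{2} + \cdots + \dfrac{1}{\nu}\right).$

Finally the asymptotic distribution $\psi(R)$ of the reduced range may be written from (27) and (28)

(28a)   $\psi(R) = \sum_{\nu=0}^{\infty} \dfrac{e^{-(\nu+1)R}}{\nu!\,\nu!}\,(R - 2\gamma + 2S_\nu).$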
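The two power series lend themselves to direct computation. The sketch below sums $1 - \Psi(R) = \sum_{\nu \geq 1} e^{-\nu R}\,(R - 2\gamma + 2S_\nu - 1/\nu)/((\nu-1)!\,\nu!)$ and $\psi(R) = \sum_{\nu \geq 0} e^{-(\nu+1)R}\,(R - 2\gamma + 2S_\nu)/(\nu!)^2$, with $S_\nu$ the partial sums of the harmonic series. The function names and the fixed number of terms are assumptions for illustration. The factorials make the series converge quickly, but for strongly negative $R$ the terms grow large before they cancel, so the sketch is only meant for about $R \geq -4$.

```python
import math

EULER_GAMMA = 0.5772156649015329


def Psi_series(R, terms=60):
    """Asymptotic probability of the reduced range from its series."""
    tail = 0.0
    harmonic = 0.0  # S_v = 1 + 1/2 + ... + 1/v
    for v in range(1, terms + 1):
        harmonic += 1.0 / v
        coeff = math.factorial(v - 1) * math.factorial(v)
        tail += (math.exp(-v * R) / coeff
                 * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic - 1.0 / v))
    return 1.0 - tail


def psi_series(R, terms=60):
    """Asymptotic density of the reduced range from its series."""
    total = 0.0
    harmonic = 0.0  # S_0 = 0
    for v in range(terms):
        if v > 0:
            harmonic += 1.0 / v
        coeff = math.factorial(v) ** 2
        total += (math.exp(-(v + 1) * R) / coeff
                  * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic))
    return total


if __name__ == "__main__":
    for R in (-3.0, 0.0, 1.0, 5.0, 10.0):
        print(R, Psi_series(R), psi_series(R))
```

The printed values can be compared with columns 2 and 4 of Table I.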


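We first investigate the analytic behavior and the order of magnitude of the probability $\Psi(R)$ and the distribution $\psi(R)$ for large negative, and large positive values of the reduced range, i.e. for large and small values of the positive variable $z$. If $z$ is so large that

(29)   $z^{-1} = \tfrac{1}{2}\,e^{R/2} \ll 1$

the expressions for $K_1(z)$ and $K_0(z)$ become [14], p. 271,

$K_1(z) = \sqrt{\dfrac{\pi}{2z}}\;e^{-z}\left(1 + \dfrac{3}{8z} - \dfrac{15}{128z^2}\right); \qquad K_0(z) = \sqrt{\dfrac{\pi}{2z}}\;e^{-z}\left(1 - \dfrac{1}{8z} + \dfrac{9}{128z^2}\right).$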
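The probability $\Psi(R)$ becomes, from (24) and (19),

(25')   $\Psi(R) = \sqrt{\pi}\,\exp\left[-\dfrac{R}{4} - 2e^{-R/2}\right]\left(1 + \dfrac{3}{16}\,e^{R/2} - \dfrac{15}{512}\,e^{R}\right).$

The condition (29) holds, say, for $R = -4$. The numerical calculation leads, for $\Psi(-4)$, to the order of magnitude $10^{-6}$.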
In the same way, the distribution $\psi(R)$ becomes, from (26) and (19), for large negative reduced ranges

(27')   $\psi(R) = \sqrt{\pi}\,\exp\left[-\dfrac{3R}{4} - 2e^{-R/2}\right]\left(1 - \dfrac{1}{16}\,e^{R/2} + \dfrac{9}{512}\,e^{R}\right).$

This expression cannot be obtained from (25') since the approximations for $K_0(z)$ and $K_1(z)$ used do not fulfill the relations between the derivatives given above. The order of magnitude of $\psi(-4)$ is $10^{-5}$.

Thus the probability $\Psi(R)$ and the distribution $\psi(R)$ may be neglected for $R \leq -4$. This removes the importance of the lower limit $R \geq -2\alpha u$ stated in (10'). If $\alpha u \geq 2$, the distribution of the range may be dealt with as if it were practically unlimited toward the left.

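For large positive reduced ranges to which correspond small values of $z$, say

(29')   $z = 2e^{-R/2} \ll 1$

the Bessel functions $K_1(z)$ and $K_0(z)$ become, from (23) and (28)

(23')   $K_1(z) = \left(\gamma + \lg\dfrac{z}{2}\right)\left(\dfrac{z}{2} + \dfrac{z^3}{16}\right) + \dfrac{1}{z} - \dfrac{z}{4}\left(1 + \dfrac{5z^2}{16}\right)$

(28')   $K_0(z) = -\left(\gamma + \lg\dfrac{z}{2}\right)\left(1 + \dfrac{z^2}{4}\right) + \dfrac{z^2}{4} + \dfrac{3z^4}{128}.$

In this case we are interested to know how far the probability $\Psi(R)$ differs from unity. Consequently we calculate $1 - \Psi(R)$ and obtain, from (24) and (23')

$1 - \Psi(R) = -\left(\gamma + \lg\dfrac{z}{2}\right)\left(\dfrac{z^2}{2} + \dfrac{z^4}{16}\right) + \dfrac{z^2}{4}\left(1 + \dfrac{5z^2}{16}\right).$

The right side becomes, from (19)

$2e^{-R}\left[\left(-\gamma + \dfrac{R}{2}\right)\left(1 + \dfrac{e^{-R}}{2}\right) + \dfrac{1}{2} + \dfrac{5}{8}\,e^{-R}\right] = e^{-R}\left[(R - 2\gamma)\left(1 + \dfrac{e^{-R}}{2}\right) + 1 + \dfrac{5}{4}\,e^{-R}\right]$

or

$e^{-R}\left[R - 2\gamma + 1 + \dfrac{e^{-R}}{2}\left(R - 2\gamma + \dfrac{5}{2}\right)\right] = e^{-R}(R - 2\gamma + 1)\left(1 + \dfrac{e^{-R}}{2}\right) + \dfrac{3}{4}\,e^{-2R}.$

If $R$ is so large that

$e^{-R} \ll 1$

we simply have

(25'')   $1 - \Psi(R) = e^{-R}(R - 2\gamma + 1).$

For example, for $R = 10$, the preceding condition is satisfied and $1 - \Psi(R)$ is of the order $5 \cdot 10^{-4}$.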

In the same manner we calculate the density of probability $\psi(R)$ for large reduced ranges. From (26), (19) and (28') we obtain

$\psi(R) = 2e^{-R}\left[\left(\dfrac{R}{2} - \gamma\right)(1 + e^{-R}) + e^{-R} + \dfrac{3}{8}\,e^{-2R}\right].$

By neglecting the term in $e^{-2R}$ within the bracket, the right side becomes

$e^{-R}[(R - 2\gamma)(1 + e^{-R}) + 2e^{-R}] = e^{-R}[R - 2\gamma + e^{-R}(R - 2\gamma + 2)]$

whence

$\psi(R) = e^{-R}(R - 2\gamma)(1 + e^{-R}) + 2e^{-2R}.$

In first approximation we obtain

(27'')   $\psi(R) = e^{-R}(R - 2\gamma)$

a formula which may also be derived directly from (25''). The density of probability is of the order $10^{-4}$ for $R = 10$.

From the formulae (25') and (27') valid for large negative values of $R$, and from the formulae (25'') and (27'') valid for large positive values of $R$ follow the boundary conditions

$\lim_{R \to -\infty} \dfrac{\psi(R)}{\Psi(R)} = e^{-R/2}; \qquad \lim_{R \to \infty} \dfrac{\psi(R)}{1 - \Psi(R)} = \dfrac{R - 2\gamma}{R - 2\gamma + 1}.$

For the construction of tables of the distribution $\psi(R)$ and the probability $\Psi(R)$ of the reduced range it is sufficient to consider the interval

$-3 < R < 10.$


The two functions $K_1(z)$ and $K_0(z)$ have been tabulated [14] and [19]. Hence the probability and the distribution could be calculated from such tables of the Bessel functions. This procedure, however, was only used to obtain boundary values. The tables I and Ia are based on computations made in the Calculation and Ballistics Department at the Naval Proving Ground, Dahlgren, by stepwise integration of the differential equation (17) using the special Relay Calculator of the International Business Machines Corporation.³
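A stepwise integration of (17) is easy to imitate today. The sketch below is not the scheme used at Dahlgren, which the text does not describe. It assumes a classical fourth-order Runge-Kutta step applied to the equivalent first-order system $\Psi' = \psi$, $\psi' = -\psi + e^{-R}\Psi$, and it starts from the rounded boundary values $\Psi(0) = .27973$, $\psi(0) = .22779$ of Table I. The function name and the step length are hypothetical.

```python
import math


def integrate_range_ode(R0, Psi0, psi0, R_end, h=0.01):
    """Integrate Psi'' + Psi' - exp(-R) * Psi = 0 from R0 to R_end.

    Classical Runge-Kutta (RK4) on the system
        dPsi/dR = psi,   dpsi/dR = -psi + exp(-R) * Psi.
    Returns (Psi(R_end), psi(R_end)).
    """
    def deriv(R, P, p):
        return p, -p + math.exp(-R) * P

    n_steps = round((R_end - R0) / h)
    R, P, p = R0, Psi0, psi0
    for _ in range(n_steps):
        k1P, k1p = deriv(R, P, p)
        k2P, k2p = deriv(R + h / 2, P + h / 2 * k1P, p + h / 2 * k1p)
        k3P, k3p = deriv(R + h / 2, P + h / 2 * k2P, p + h / 2 * k2p)
        k4P, k4p = deriv(R + h, P + h * k3P, p + h * k3p)
        P += h / 6 * (k1P + 2 * k2P + 2 * k3P + k4P)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        R += h
    return P, p


if __name__ == "__main__":
    # Start from the tabulated values at R = 0 and step to R = 5.
    print(integrate_range_ode(0.0, 0.27973, 0.22779, 5.0))
```

Integrating toward increasing $R$ is the stable direction: the unwanted second solution of (17) dies out as $R$ grows, so small errors in the starting values are not amplified.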

Table I gives the probability $\Psi(R)$ (col. 2) and the distribution $\psi(R)$ (col. 4) for the reduced ranges $-3 \leq R \leq 10.5$ in intervals $\Delta R = 0.5$. The differences $\Delta\Psi$ given in col. 3 are taken from the original figures.

For different uses it is necessary to know the reduced range as a function of its probability. This relation is shown in Table Ia. The first column gives the probability, the first line gives the last decimal of this probability, and the cells give the reduced range corresponding to the probability obtained from the combination of the first column and the first line. For example, the reduced range $R = -3.20$ corresponds to the probability $\Psi(R) = 0.0002$, and the reduced range $R = 10.44$ corresponds to the probability $\Psi(R) = 0.9997$.

This table may be used for obtaining the percentage points of the reduced range. The mode $\tilde{R}$, the median $\breve{R}$ calculated by the Naval Proving Ground and the mean $\bar{R}$ obtained from (14) and (10) are

(30)   $\tilde{R} = 0.506366440; \qquad \breve{R} = 0.928697642; \qquad \bar{R} = 1.154431330$
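Percentage points, the median and the mode can also be recomputed by root finding. The sketch below is an illustration under stated assumptions: it evaluates $\Psi(R)$ and $\psi(R)$ from their power series in $e^{-R}$, and uses plain bisection, with the mode located from condition (18), $\psi(\tilde{R}) = e^{-\tilde{R}}\Psi(\tilde{R})$. All function names and search brackets are hypothetical choices.

```python
import math

EULER_GAMMA = 0.5772156649015329


def Psi(R, terms=60):
    """Probability of the reduced range (series in exp(-R))."""
    tail, harmonic = 0.0, 0.0
    for v in range(1, terms + 1):
        harmonic += 1.0 / v
        tail += (math.exp(-v * R)
                 / (math.factorial(v - 1) * math.factorial(v))
                 * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic - 1.0 / v))
    return 1.0 - tail


def psi(R, terms=60):
    """Density of the reduced range (series in exp(-R))."""
    total, harmonic = 0.0, 0.0
    for v in range(terms):
        if v > 0:
            harmonic += 1.0 / v
        total += (math.exp(-(v + 1) * R) / math.factorial(v) ** 2
                  * (R - 2.0 * EULER_GAMMA + 2.0 * harmonic))
    return total


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def reduced_range_quantile(p):
    """Reduced range R with Psi(R) = p, for p roughly in (1e-5, 0.99999)."""
    return bisect(lambda R: Psi(R) - p, -4.0, 15.0)


def reduced_range_mode():
    """Mode from psi(R) = exp(-R) * Psi(R), searched in [0, 1]."""
    return bisect(lambda R: math.exp(-R) * Psi(R) - psi(R), 0.0, 1.0)


if __name__ == "__main__":
    print("mode  ", reduced_range_mode())
    print("median", reduced_range_quantile(0.5))
    print("95%   ", reduced_range_quantile(0.95))
    print("99%   ", reduced_range_quantile(0.99))
```

The results can be compared with the rounded values .506, .929, 4.46 and 6.45 quoted for the reduced range in Table II.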

A probability paper for the range may be constructed in the following way: The observed ranges $w$ are plotted on the vertical axis; the reduced ranges $R$ on a horizontal axis. The abscissa shows the probabilities

$\Psi(R) = G(w)$

³ The author wishes to express his sincere appreciation for the permission to use these computations. The original tables give the probability and the distribution to 8 significant decimal places at intervals $\Delta R = 1/100$. Lack of space prevents the reproduction of these tables.
TABLE I

Asymptotic Probability and Asymptotic Distribution of the Reduced Range

       1               2               3               4
 Reduced Range    Probability     Difference     Distribution
       R           $\Psi(R)$     $\Delta\Psi$      $\psi(R)$

     -3.0           .00050          .00274          .00212
     -2.5           .00324          .01032          .01057
     -2.0           .01356          .02693          .03386
     -1.5           .04048          .05251          .07705
     -1.0           .09299          .08141          .13419
     -0.5           .17440          .10533          .18969
      0.0           .27973          .11821          .22779
      0.5           .39794          .11859          .24075
      1.0           .51654          .10891          .23021
      1.5           .62545          .09327          .20346
      2.0           .71872          .07557          .16898
      2.5           .79429          .05860          .13360
      3.0           .85289          .04386          .10157
      3.5           .89675          .03192          .07483
      4.0           .92867          .02270          .05375
      4.5           .95136          .01584          .03783
      5.0           .96721          .01089          .02618
      5.5           .97810          .00739          .01787
      6.0           .98549          .00496          .01205
      6.5           .99045          .00330          .00805
      7.0           .99375          .00218          .00534
      7.5           .99594          .00143          .00351
      8.0           .99737          .00093          .00230
      8.5           .99830          .00061          .00150
      9.0           .99891          .00039          .00097
      9.5           .99930          .00025          .00062
     10.0           .99955          .00016          .00040
     10.5           .99972                          .00026


corresponding to the reduced ranges $R$. If the observations follow the theory, the observed ranges are scattered around the straight line

(10')   $w = 2u + \dfrac{R}{\alpha}.$

If the samples are drawn simultaneously, and if there is a constant interval of time between the drawings, this interval may be used as unit of time for the construction of the return periods $T(R)$ and ${}_1T(R)$ of a range equal to, or larger than (smaller than) $R$ where

$T(R) = \dfrac{1}{1 - \Psi(R)}; \qquad {}_1T(R) = \dfrac{1}{\Psi(R)}.$

The first (second) notion applies to the range above (below) the median. The return periods are shown in an upper parallel to the abscissa.

A scheme for this paper is given in Fig. 3. Such a paper will allow a graphical test for the fit of the observed ranges to our theory, and avoids any numerical calculations. Obviously this method may only be used if the initial distribution is symmetrical, unlimited, and of the exponential type, and if the sample size is so large that the asymptotic distribution holds.
6. The range, the midrange, and the extremes. The asymptotic distribution (27) of the reduced range was obtained by convolution of the asymptotic distributions (5) of the extremes. The same method leads to the asymptotic distribution of the reduced midrange [8]

(31)   $v = \alpha(x_1 + x_n).$

TABLE Ia

The Reduced Range R as Function of its Probability $\Psi(R)$

$\Psi(R)$      0       1       2       3       4       5       6       7       8       9

 .000          -       *     -3.20   -3.12     …       …     -2.96   -2.92   -2.89   -2.86
 .00           -     -2.83   -2.64   -2.52   -2.43   -2.36     …     -2.25     …     -2.16
 .0            -     -2.12   -1.84   -1.65   -1.51   -1.39   -1.28   -1.19     …     -1.02
 .1            …       …       …       …       …     -0.63     …       …       …     -0.42
 .2            …       …       …       …       …     -0.13     …       …       …      0.04
 .3            …       …       …       …       …       …       …       …       …      0.47
 .4            …       …       …       …       …      0.72     …       …       …      0.89
 .5           0.93    0.97    1.02     …       …      1.15    1.19    1.24    1.28    1.33
 .6           1.38    1.43    1.47    1.52    1.57    1.62    1.67    1.73    1.78    1.84
 .7           1.89    1.95    2.01    2.07    2.13    2.19    2.26    2.33    2.40    2.47
 .8           2.54    2.62    2.70    2.79    2.88    2.97    3.07    3.18    3.29    3.41
 .9           3.54    3.69    3.85     …      4.23     …      4.75    5.11    5.61    6.45
 .99          6.46    6.57    6.71    6.87     …      7.26    7.52    7.85    8.31     …
 .999         9.10    9.22    9.35     …      9.67    9.88     …       …       …       …

* These values have not been calculated.


On the other hand, the asymptotic distributions of the reduced extremes are obtained by introducing the transformations

(32)   $y_1 = \alpha(x_1 + u); \qquad y_n = \alpha(x_n - u)$

into formulas (5). It is interesting to compare these four distributions and four probabilities with each other. This is done in Figures 1 and 2. The probability and the distribution of the midrange are practically identical with the probability and distribution of the smallest value, for small values of the midrange, and become practically identical with the probability and distribution of the largest value for large values of the midrange. Fig. 2 shows that the asymptotic distribution of the reduced range is less asymmetrical than the asymptotic distributions of the reduced extremes.
Table II contains some characteristic values for these four asymptotic distributions. The first three columns are obtained from previous publications [6, 8]. The mean range is equal to the range of the means for the extremes. The median of the range is larger than the range from the median of the largest to the median of the smallest value. The mode of the range is slightly smaller than the mean of the largest value. These statements hold, of course, only for the reduced variates.



[Fig. 2. Asymptotic distributions of the reduced extremes, midrange, and range. Horizontal axis: reduced variate.]


From the mode $\tilde{R}$ of the reduced range given in equation (30) and the transformation (10), the mode $\tilde{w}$ of the range itself is obtained as

$\tilde{w} = 2u + \dfrac{\tilde{R}}{\alpha}$

whereas the difference of the modes of the largest and of the smallest values is

(33)   $\tilde{x}_n - \tilde{x}_1 = 2u.$




[Figure: axis label "Return Period T(R)".]
For a symmetrical initial distribution of the exponential type the mode of the range converges toward the range of the modes of the smallest and of the largest value, provided that the parameter $\alpha$ increases without limit with the sample size. Thus this convergence does not hold for all symmetrical distributions. The last two lines in Table II give the four probabilities corresponding to the intervals from the mean $\mu$ minus once (twice) the standard deviation $\sigma$, up to the mean plus once (twice) the standard deviation. The first probability for
TABLE II

Characteristics for the 4 Asymptotic Reduced Distributions

 1                                    2                       3                        4                          5
 Characteristic                       Largest Value           Smallest Value           Midrange                   Range

 Mode                                 0                       0                        0                          .506
 Expectation                          $\gamma$ = .57722       $-\gamma$ = -.57722      0                          $2\gamma$ = 1.15444
 Median                               $-\lg\lg 2$ = .36651    $\lg\lg 2$ = -.36651     0                          .929
 Seminvariant char. function          $\Gamma(1-t)$           $\Gamma(1+t)$            $\Gamma(1-t)\Gamma(1+t)$   $\Gamma^2(1-t)$
 Variance                             $\pi^2/6$ = 1.64493     $\pi^2/6$ = 1.64493      $\pi^2/3$ = 3.28986        $\pi^2/3$ = 3.28986
 First moment quotient $\beta_1$      1.29857                 1.29857                  0                          .64928
 Second moment quotient $\beta_2$     5.4                     5.4                      4.2                        4.2
 95% Probability                      2.97                    1.10                     2.94                       4.46
 99% Probability                      4.60                    1.53                     4.60                       6.45
 $F(\mu+\sigma) - F(\mu-\sigma)$      .72                     .72                      .72                        .71
 $F(\mu+2\sigma) - F(\mu-2\sigma)$    .96                     .96                      .95                        .95

the four distributions is about the same as for the normal distribution. The second probability for the range and the midrange is about the same as for the normal one.

7. The asymptotic distribution of the range for a symmetrical variate. The asymptotic distribution of the reduced range $R$ is, of course, independent of the sample size, and parameter-free. Both statements do not hold for the distribution $g(w)$ of the range proper which is, from (11)

(34)   $g(w) = \alpha\,\psi[\alpha(w - 2u)].$

In this formula, the range is expressed in the same units as the initial variate. The parameters $\alpha$ and $u$ are functions of the sample size $n$, the function depending upon the initial distribution. From equations (6), (8), (14) it follows that an increase of the sample size has two influences on the distribution of the range. The increase of the parameter $u$ shifts the distribution toward the right without changing its form, whereas the parameter $\alpha$ influences the shape of the distribution. If $\alpha$ increases (decreases) with $n$, the distribution of the range shrinks (spreads) with increasing sample size. If $\alpha$ is independent of $n$, an increase of the sample size does not change the shape of the distribution. Only in the first case may we increase the precision of the range by increasing the sample size. The two parameters thus influence the range in the same way as they influence the extreme values.

To use equation (34) for a given initial distribution and a given sample size, we have to determine the expected largest value $u$ and the parameter $\alpha$ as functions of $n$. We may use the definitions (6), (7), (8) if the initial distribution is known and of the exponential type, and if the sample size is so large that the most probable largest value is sufficiently near to the solution of (7).

As a first example, consider the so-called logistic distribution. This probability is

(35)   $\Phi(x) = (1 + e^{-x})^{-1}.$

The initial distribution is

(35')   $\varphi(x) = \Phi(x)(1 - \Phi(x))$

and the derivative is

(35'')   $\varphi'(x) = \Phi(x)(1 - \Phi(x))(1 - 2\Phi(x)).$

Equation (6) becomes

$1 + e^{-u} = \dfrac{n}{n - 1}$

whence the expected largest value

(36)   $u = \lg(n - 1).$

The most probable largest value for $n$ observations is obtained from (7). This equation becomes, from equation (35)

$(n - 1)(1 - \Phi(\tilde{x}_n)) = -1 + 2\Phi(\tilde{x}_n)$

whence

$\Phi(\tilde{x}_n) = \dfrac{n}{n + 1}.$

Equation (35) leads to the most probable largest value

(36')   $\tilde{x}_n = \lg n.$

Even for $n$ as small as 30 the difference between $\tilde{x}_n$ and $u$ is less than 1%. Consequently the asymptotic form of the distribution of the range may be used even for small samples. The two parameters are

(37)   $u = \lg n; \qquad \alpha = \dfrac{n}{n + 1}.$
Since $\alpha$ converges toward unity, an increase of the sample size shifts the distribution of the range toward the right without influencing its shape; the precision of any estimate made from the range cannot be increased by increasing the sample size.

The characteristic ranges introduced in paragraph 5 are obtained immediately: the mean $\bar{w}$, the mode $\tilde{w}$, the median range $\breve{w}$ and the ranges $w_{.95}$ and $w_{.99}$

$\bar{w} = 2\lg n + 1.154; \quad \tilde{w} = 2\lg n + .506; \quad \breve{w} = 2\lg n + .929; \quad w_{.95} = 2\lg n + 4.46; \quad w_{.99} = 2\lg n + 6.45$

are parallel straight lines if traced as functions of the sample size $n$ on semi-logarithmic paper.
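The logistic case is simple enough to tabulate in a few lines. The sketch below is an illustration, not the paper's procedure: it takes $u = \lg(n-1)$ from (36), evaluates $\alpha$ from definition (8) at that $u$ (which gives $(n-1)/n$), and converts reduced ranges into ranges through the inverse of (10), $w = 2u + R/\alpha$. The reduced values are the rounded ones quoted in Table II; the function and variable names are hypothetical.

```python
import math


def logistic_cdf(x):
    """Phi(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_pdf(x):
    """phi(x) = Phi(x) * (1 - Phi(x))."""
    F = logistic_cdf(x)
    return F * (1.0 - F)


def logistic_extreme_parameters(n):
    """Return (u, modal largest value, alpha) for sample size n >= 2."""
    u = math.log(n - 1)                                # equation (36)
    x_mode = math.log(n)                               # equation (36')
    alpha = logistic_pdf(u) / (1.0 - logistic_cdf(u))  # definition (8)
    return u, x_mode, alpha


# Rounded characteristic values of the reduced range (Table II).
REDUCED_VALUES = {
    "mode": 0.506, "median": 0.929, "mean": 1.154,
    "95%": 4.46, "99%": 6.45,
}


def characteristic_ranges(n):
    """Ranges w = 2u + R / alpha for the characteristic reduced ranges."""
    u, _, alpha = logistic_extreme_parameters(n)
    return {name: 2.0 * u + R / alpha for name, R in REDUCED_VALUES.items()}


if __name__ == "__main__":
    for size in (30, 100, 1000):
        print(size, characteristic_ranges(size))
```

For large $n$ the output approaches $2\lg n$ plus the reduced value, because $\lg(n-1)$ approaches $\lg n$ and $\alpha$ approaches unity.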

For the normal distribution we cannot expect such simple results. Here, u 
and a can only be calculated as numerical functions of n although limiting forms 
of these functions are known. The parameter a increases with n, and the 
standard error of the range decreases without limit although very slowly. The 
logistic distribution belongs to the first, the normal distribution to the second 
class of initial distributions of the exponential type. 

The probabilities and the distributions of the range for normal samples of size 5, 10, and 20 as calculated by E. S. Pearson and H. O. Hartley [16] are traced in Figures 4 and 5. Our aim is to trace the corresponding asymptotic probabilities and distributions in order to see how far the asymptotic ranges differ from the exact ones. However, we have first to settle the preliminary question how far the most probable largest value differs from the expected largest value $u$. The most probable largest value $\tilde{x}_n$ is obtained from (7) which becomes, for the normal distribution,

(38)   $\tilde{x}_n\,\Phi(\tilde{x}_n) = (n - 1)\,\varphi(\tilde{x}_n).$
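Both largest-value constants for the normal distribution are easy to compute numerically. The sketch below solves $\tilde{x}_n\,\Phi(\tilde{x}_n) = (n-1)\,\varphi(\tilde{x}_n)$ by bisection and obtains $u$ from $\Phi(u) = 1 - 1/n$. It is an illustration only: the function names and the search bracket $[0, 10]$ are assumptions, and the normal distribution comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def expected_largest(n):
    """u defined by Phi(u) = 1 - 1/n, equation (6); requires n >= 2."""
    return STANDARD_NORMAL.inv_cdf(1.0 - 1.0 / n)


def modal_largest(n, tol=1e-10):
    """Most probable largest value: root of x*Phi(x) - (n-1)*phi(x)."""
    def f(x):
        return (x * STANDARD_NORMAL.cdf(x)
                - (n - 1) * STANDARD_NORMAL.pdf(x))

    lo, hi = 0.0, 10.0  # f(lo) < 0 and f(hi) > 0 for n >= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for size in (3, 5, 10, 20, 100):
        print(size, modal_largest(size), expected_largest(size))
```

On $[0, 10]$ the function is increasing, since $x\Phi(x)$ grows while $\varphi(x)$ falls, so the bisection bracket contains exactly one root.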

The results as functions of $n$ are shown in Table III, cols. 1 and 2. The expected values $u$ obtained from (6) are given in col. 3. For small samples, the two values $\tilde{x}_n$ and $u$ differ widely, as might be expected. We are inclined to conclude that the asymptotic distribution of the range cannot hold for small samples. However, the only legitimate conclusion to be drawn is that we cannot calculate the two parameters in the way stated before, from (6) and (8). Instead, we estimate them directly from the observations. The question of the most efficient estimates of these parameters is not yet solved. The simplest way is to use the mean range $\bar{w}_n$ and the standard deviation of the range $\sigma_{w,n}$ as given by Tippett [20] and Pearson [15]. To distinguish these estimates from the asymptotic values, we write the estimates with an index $n$. From (14) we obtain

(39)   $\dfrac{1}{\alpha_n} = \dfrac{\sqrt{3}}{\pi}\,\sigma_{w,n}; \qquad 2u_n = \bar{w}_n - \dfrac{2\gamma}{\alpha_n}.$
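The estimation step amounts to two lines of arithmetic. The sketch below assumes the moment relations of (14), that is $1/\alpha_n = \sqrt{3}\,\sigma_{w,n}/\pi$ and $2u_n = \bar{w}_n - 2\gamma/\alpha_n$, and also returns the product $2\alpha_n u_n$, the magnitude of the lower limit of the reduced range. The function name is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def estimate_range_parameters(mean_range, sd_range):
    """Estimate (1/alpha_n, 2*u_n, 2*alpha_n*u_n) from range moments.

    mean_range: mean of the range for samples of size n
    sd_range:   standard deviation of the range for samples of size n
    """
    inv_alpha = math.sqrt(3.0) * sd_range / math.pi
    two_u = mean_range - 2.0 * EULER_GAMMA * inv_alpha
    lower_limit = two_u / inv_alpha
    return inv_alpha, two_u, lower_limit


if __name__ == "__main__":
    # Normal samples of size 5: mean range 2.326, standard deviation .8641.
    print(estimate_range_parameters(2.326, 0.8641))
```

With the values for normal samples of size 5 the function gives approximately .4764, 1.776 and 3.73, the entries of the corresponding row of Table III.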

Table III gives the calculated means w„ and standard deviations <t„,„ of the 
range, and the estimates 1/an and 2 m„ . Fig. 6 shows how the most probable 



404 


E. J. GUMBEL 


largest values a'„ approach the expected largest value u ivith increasing sample 
size, The estimate «„ quickly approaches u. Besides we trace the mean range 
, the standard error of the range o-„.„ , and l/o;,, which is proportional to it. 



From col, 8 follows that the condition ^ 2 is fulfilled from w ^ 6 onward. 
The ranges obtained from the transformation

(40)    w = 2u_n + \frac{R}{\alpha_n}

are given in Table IV, cols. 3-7. The asymptotic probabilities of the range as
obtained from the combination of columns 3-7 and col. 2 of Table IV are traced
in Fig. 4 as separated points. The asymptotic probabilities are situated very
near to the exact ones. Therefore the same method was used to calculate the
asymptotic probabilities of the range for n = 50 and n = 100, which have not
been calculated by Pearson. They too are traced in Fig. 4.



The asymptotic probabilities of the range hold even for small normal samples.
However, the parameters obtained from the exact distribution differ considerably
from their asymptotic values. In other words: the asymptotic probabilities of the
range hold even for small normal samples provided that the parameters are taken
from the observations.

To compare the asymptotic distributions of the normal range to the calculated
distributions, we attribute the asymptotic differences $\Delta\Psi/\alpha_n$ for a unit interval
$\Delta w = 1$ to the middle of the corresponding intervals. The results are traced in
Fig. 5 for n = 5, 10, 20, 50, 100. On the other hand, we take the differences
for unit intervals from Pearson's tables, and trace them in the same graph.
The fit of the calculated to the asymptotic values may be considered satisfactory.

TABLE III

Estimate of Parameters from the Calculated Distributions
of the Normal Range

    1         2         3          4            5              6           7            8
 Sample     Largest value       Mean range   Standard      Estimated parameters      Lower limit
 size n    Modal    Expected u               deviation     1/alpha_n     2u_n        2 alpha_n u_n
                                             sigma_{w,n}

    3       .765      .431                    .8884         .4868        1.128         2.30
    4       .938      .674                    .8798         .4851        1.499         3.09
    5                 .842       2.326        .8641         .4764        1.776         3.73
   10      1.419     1.282       3.078                      .439         2.571         5.86
   20                1.645       3.735        .729                       3.271         8.14
   50      2.126                 4.498        .653                                    11.34
  100      2.377     2.326                                  .334         4.630        13.86


TABLE IV

Asymptotic Probabilities for Normal Ranges Taken from Small Samples

    1            2              3         4         5         6         7
 Reduced    Probability      Normal ranges w = 2u_n + R/alpha_n for sample sizes
 range R    G(w) = Psi(R)     n = 5    n = 10    n = 20    n = 50    n = 100

   -3          .000            .35      1.25      2.07      3.00      3.62
   -2          .014            .82      1.69      2.47      3.36      3.96
   -1          .093           1.30      2.13      2.87      3.72      4.30
    0          .280           1.78      2.57      3.27      4.08      4.63
    1          .517           2.25      3.01      3.67      4.44      4.96
    2          .719           2.73      3.45      4.07      4.80      5.30
    3          .853           3.21      3.89      4.48      5.16      5.63
    4          .929           3.68      4.33      4.88      5.52      5.97
    5          .967           4.16      4.77      5.28      5.88      6.30
    6          .985           4.63      5.20      5.68      6.24      6.63
    7          .994           5.11      5.64      6.09      6.60      6.97

Fig. 6 shows furthermore how the distributions of the range are shifted toward 
the right and become more concentrated for increasing sample sizes. 

As an example for the practical application of the asymptotic distribution of
the range, we use an observed distribution of 50 ranges taken from samples of
n = 14 normal values given in Freeman's book [5], p. 128. The observed step
function is traced in Fig. 7. For reasons given in a previous article [7], we
attribute the cumulative frequency .5 to the smallest range 3, and the cumulative
frequency 49.5 to the largest range 18.

Fig. 7

To compare this step function with the probability G(w), we estimate the two
parameters $u_n$ and $\alpha_n$ from formula (39). The mean range $\bar{w}_n$ and the estimate
$s_{w,n}$ of the standard deviation of the ranges are

    \bar{w}_n = 10.68; \qquad s_{w,n} = 2.93.

Consequently we obtain, from (39),

    \frac{1}{\alpha_n} = 1.61; \qquad 2u_n = 8.82.

The theoretical ranges are thus, from (40),

    w = 8.82 + 1.61 R.

The corresponding probabilities G(w) taken from Table I are traced in Fig. 7.
The fit of the theory to the observations is certainly satisfactory, especially if
we take into account that the ranges are given in integer numbers only.
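The arithmetic of this example can be retraced as follows. This is a sketch only, assuming that (39) reads 1/alpha_n = (sqrt(3)/pi) sigma_{w,n} and 2u_n = mean range - 2 gamma / alpha_n with gamma Euler's constant, and that (40) is w = 2u_n + R/alpha_n; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, the gamma of (39)


def range_parameters(mean_range: float, sd_range: float) -> tuple:
    """Return (1/alpha_n, 2*u_n) from the mean and standard deviation of the range."""
    inv_alpha = math.sqrt(3.0) / math.pi * sd_range
    two_u = mean_range - 2.0 * EULER_GAMMA * inv_alpha
    return inv_alpha, two_u


def range_from_reduced(reduced_range: float, inv_alpha: float, two_u: float) -> float:
    """Transformation (40): w = 2*u_n + R / alpha_n."""
    return two_u + reduced_range * inv_alpha


# Observed ranges of the example: mean 10.68, standard deviation 2.93.
inv_alpha, two_u = range_parameters(10.68, 2.93)
theoretical_ranges = [range_from_reduced(R, inv_alpha, two_u) for R in range(-3, 8)]
```

With the unrounded constants the first estimate comes out near 1.615, so the printed 1.61 appears to be a shortened value; the second estimate agrees with 8.82 to two decimals.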

8. The mth range and the asymmetrical case. An obvious generalization
of the theory as established in paragraph 4 consists in the construction of the
asymptotic distribution of the mth range for an unlimited symmetrical distribution
of the exponential type. The mth range is the positive distance from the
mth observation from above, $x_m$, to the mth observation from below, ${}_m x$. We
suppose m to be very small compared to the sample size. Under the conditions
stated in the beginning, the joint distribution $w({}_m x, x_m)$ of the mth extreme
values splits into the product of the asymptotic distribution of the mth extreme
value from above, $f_m(x_m)$, by the asymptotic distribution of the mth extreme
value from below, ${}_m f({}_m x)$. Here, [6]

    f_m(x_m) = \frac{m^m}{(m-1)!}\,\alpha_m \exp[-m\alpha_m(x_m - u_m) - m e^{-\alpha_m(x_m - u_m)}],

    {}_m f({}_m x) = \frac{m^m}{(m-1)!}\,\alpha_m \exp[m\alpha_m({}_m x + u_m) - m e^{\alpha_m({}_m x + u_m)}].

The sample size must be so large that the most probable mth extreme value $\tilde{x}_m$
is sufficiently near to $u_m$, which is defined as the solution of

    \Phi(u_m) = 1 - \frac{m}{n}.

The factor $\alpha_m$ defined by

    \alpha_m = \frac{\varphi(u_m)}{1 - \Phi(u_m)}

is related to the asymptotic standard error $\sigma_m$ of the mth extreme value by

    \alpha_m^2 \sigma_m^2 = \frac{\pi^2}{6} - \sum_{k=1}^{m-1} \frac{1}{k^2}.

The joint asymptotic distribution $w({}_m x, w_m)$ of the mth smallest value and the
mth range

(41)    w_m = x_m - {}_m x

is

    w({}_m x, w_m) = \left[\frac{m^m}{(m-1)!}\right]^2 \alpha_m^2 \exp[-m\alpha_m(w_m - 2u_m) - m e^{\alpha_m({}_m x + u_m)} - m e^{-\alpha_m({}_m x + w_m - u_m)}].

The asymptotic distribution $g(w_m)$ of the mth range is, dropping the index m of
the variable ${}_m x$,

    g(w_m) = \left[\frac{m^m}{(m-1)!}\right]^2 \alpha_m^2 e^{-m\alpha_m(w_m - 2u_m)} \int_{-\infty}^{+\infty} \exp[-m e^{\alpha_m(x + u_m)} - m e^{-\alpha_m(x + w_m - u_m)}]\,dx.

Again we introduce a reduced range $R_m$, defined by

(42)    \alpha_m(w_m - 2u_m) = R_m

and put for the integration

    \alpha_m(x + u_m) = y.

Then the asymptotic distribution of the reduced mth range is

    \psi_m(R_m) = \left[\frac{m^m}{(m-1)!}\right]^2 e^{-mR_m} \int_{-\infty}^{+\infty} \exp[-m e^{y} - m e^{-y - R_m}]\,dy.

The probability $\Psi_m(R_m)$ for the mth range

(43)    \Psi_m(R_m) = \int_{-\infty}^{R_m} \psi_m(z)\,dz

cannot be reduced to a single integral. This is due to the fact that the probabilities
of the mth extreme values cannot be written down except in the integral
form [6]. No differential equation similar to (17) exists. However, the function
(43) could be calculated by numerical methods. The mean $\bar{R}_m$, the generating
function and the moments of the mth range have been given in a previous
paper [8].
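For a normal initial distribution the two parameters of the mth extreme value can be computed directly. The sketch below assumes the definitions Phi(u_m) = 1 - m/n and alpha_m = phi(u_m) / (1 - Phi(u_m)); the function name is hypothetical and the standard normal is an assumption of the example, not a restriction of the theory.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal initial distribution (assumed for illustration)


def mth_extreme_parameters(m: int, n: int) -> tuple:
    """Return (u_m, alpha_m) for the mth extreme value from above in samples of size n."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    u_m = _STD.inv_cdf(1.0 - m / n)                    # Phi(u_m) = 1 - m/n
    alpha_m = _STD.pdf(u_m) / (1.0 - _STD.cdf(u_m))    # equals (n/m) * phi(u_m)
    return u_m, alpha_m
```

For m = 1 the value u_1 is the expected largest value u of Table III, col. 3, for instance about .842 for n = 5.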

For sake of completeness, consider finally an unlimited asymmetrical initial
distribution of the exponential type. In this case, the joint distribution of the
smallest and of the largest value splits again, for large samples, into the product
of the asymptotic distributions $f_1(x_1)$ and $f_n(x_n)$ of the smallest and of the largest
values, which are now [6]

    f_1(x_1) = \alpha_1 \exp[\alpha_1(x_1 - u_1) - e^{\alpha_1(x_1 - u_1)}];

    f_n(x_n) = \alpha_n \exp[-\alpha_n(x_n - u_n) - e^{-\alpha_n(x_n - u_n)}].

Here, $u_n$ and $\alpha_n$ are defined, as previously, by (6) and (8). The sample must
be so large that the most probable smallest value $\tilde{x}_1$ is sufficiently near to the
solution $u_1$ of

    \Phi(u_1) = \frac{1}{n}.

The factor $\alpha_1$ defined by

    \alpha_1 = \frac{\varphi(u_1)}{\Phi(u_1)}

is related to the asymptotic standard error $\sigma_1$ of the smallest value by

    \sigma_1 = \frac{\pi}{\alpha_1\sqrt{6}}.

The joint asymptotic distribution of the smallest value $x_1$ and the range w,

    w(x_1, w) = \alpha_1\alpha_n \exp[\alpha_1(x_1 - u_1) - \alpha_n(x_1 + w - u_n) - e^{\alpha_1(x_1 - u_1)} - e^{-\alpha_n(x_1 + w - u_n)}],

contains four parameters instead of the two which exist in the symmetrical case.
However, the number of parameters may be reduced to one. We introduce a
reduced range R defined by

(44)    R = \alpha_n(w - u_n + u_1),

being the range itself minus the range of the modes divided by a factor
proportional to the standard error of the largest value. If we put

(45)    \alpha_1(x_1 - u_1) = y; \qquad \frac{\alpha_n}{\alpha_1} = \beta,

the distribution $\psi(R)$ of the reduced range becomes, in the asymmetrical case,

(46)    \psi(R) = \int_{-\infty}^{+\infty} \exp[y(1 - \beta) - e^{y} - R - e^{-\beta y - R}]\,dy

and the probability $\Psi(R)$ for the reduced range is

(47)    \Psi(R) = \int_{-\infty}^{+\infty} \exp[y - e^{y} - e^{-\beta y - R}]\,dy,

a formula which may immediately be verified by differentiation with respect to
R. The mode $\tilde{R}$ of the range is the solution of

    \psi(\tilde{R}) = e^{-\tilde{R}} \int_{-\infty}^{+\infty} \exp[y(1 - 2\beta) - \tilde{R} - e^{y} - e^{-\beta y - \tilde{R}}]\,dy.

Contrary to the symmetrical case, the latter integral cannot be expressed by the
probability, and no simple differential equation similar to (17) exists. The
expressions (46) and (47) contain a single constant $\beta$ measuring the asymmetry of
the initial distribution. In the symmetrical case, $\beta = 1$, we obtain, of course, the
previous formulas (12) and (13). In the asymmetrical case, the mean, the
variance, and the higher moments of the mth range may be derived from the
generating function given in a previous paper [8].
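The probability of the reduced range can be evaluated by direct quadrature. The following sketch assumes that (47) has the integrand exp[y - e^y - e^(-beta*y - R)] over the whole real line; the function name, the integration limits, and the step count are choices of this illustration only.

```python
import math


def range_probability(R: float, beta: float = 1.0,
                      half_width: float = 30.0, steps: int = 60000) -> float:
    """Trapezoidal estimate of the integral of exp[y - e^y - e^(-beta*y - R)] dy.

    The integrand is negligible outside (-half_width, half_width) for moderate
    beta and R; very large beta or very negative R would need narrower limits
    to avoid overflow in math.exp.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        value = math.exp(y - math.exp(y) - math.exp(-beta * y - R))
        total += value if 0 < k < steps else 0.5 * value
    return total * h
```

In the symmetrical case, beta = 1, the values obtained this way agree with col. 2 of Table IV to the three decimals given, for example about .280 for R = 0.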

The asymptotic distribution of the mth range in the asymmetrical case can
easily be obtained by combining the two procedures used in this paragraph.

Addition at proof reading:

G. Elfving's article "The asymptotical distribution of range in samples from a normal
population", Biometrika, Vol. 36 (1947), appeared when this manuscript was ready for
print. Elfving considers a probability transformation of the range whereas we deal with
the range itself. His distribution requires the knowledge of the initial distribution and
of the sample size, whereas this knowledge is not required in our asymptotic formula.

1. Summary. The means, variances, and covariances for samples of size
≤ 10 from the normal distribution, a selected long-tailed distribution, and the
uniform distribution are tabled and compared with the usual asymptotic
approximations. The methods of computation used and the accuracy expected
are discussed. Use is made of the representation of an arbitrarily distributed
variate as a monotone function of a uniformly (rectangularly) distributed
variate. It is hoped that these tables will encourage experimentation with new
statistical procedures.

2. Introduction. Two sorts of statistical procedures have been widely
exploited in theoretical statistics: first the use of linear and quadratic
combinations of the unordered observations and, second, the use of ranked (ordered)
observations. Statistics based on ordered observations have recently been
dubbed systematic statistics [2, Mosteller, 1946]. Analytic processes and a few
necessary numerical tables have advanced the study of the first procedure greatly,
at least for the special case of the normal distribution, but analytic procedures
have not done much to exhibit the behavior of systematic statistics and the
necessary tables have been lacking.

It would be very helpful to have (1) at least the first two moments (including
product moments) of the order statistics, and (2) tables of the percentage points
of their distributions, for samples of sizes from 1 to some moderately large value
such as 100 and for a large representative family of distributions. This is a
large order and will require much computation.

The first step in this direction was taken by Fisher and Yates [1] by tabulating
the means, to two decimal places, of all order statistics from normal samples of
size ≤ 50. The present paper continues the process by supplying all means,
variances, and covariances for samples of size ≤ 10 from (a) the normal
distribution, (b) the uniform (rectangular) distribution, (c) a special distribution
with long tails. For purposes of comparison, we also supply approximate
means, variances, and covariances for the uniform and the special distribution
computed from suitable asymptotic formulas.

The special distribution has the representing function

(1)    r(u) = (1 - u)^{-\lambda} - u^{-\lambda},

where u has the uniform distribution on the interval [0, 1], and x = r(u) is the
variable whose order statistics interest us. This special distribution was
especially constructed 1) to have high tails and 2) to provide moments of order
statistics in closed form which could be evaluated with a reasonable amount of
labor. The normal distribution is rather unreasonable in this latter respect,
there being no known expression except in terms of single and double quadratures
of some considerable numerical difficulty.

We have restricted ourselves to samples of size ≤ 10, and to only three
distributions, all of these symmetrical, because of limited man-power rather than
limited interest. Additional tables of a similar nature will surely prove helpful.

In order to obtain even as much information as provided in this paper, it has
been necessary to make a joint effort, dividing the labor. The various parts of
the work have been carried out more or less separately by the various authors:
the means and variances for the normal by Mosteller, the covariances for the
normal (which, with their double quadratures, required far more time than all
the other thought and computation combined) by Hastings with some assistance
from Mosteller, the choice of the special distribution by Tukey, and the
computation for it by Winsor.




3. Results. In this section we provide the various tables that have been
computed.

Table I gives the mean and standard deviation of the ith order statistic
x(i | n) [or $x_{i|n}$; we use whichever notation seems less likely to confuse and
agree that x(1 | n) > x(2 | n) > ... > x(n | n)] from a sample of size n drawn
from a uniform (U), normal (N), and a special distribution (S). All three
distributions have been adjusted to have zero mean and unit variance. In
addition Table I gives approximations for the mean and standard deviation as
computed from asymptotic formulas for the normal (AN) and the special (AS).

If f(x) is the density function, the asymptotic approximation for the mean
m(i | n) of the ith order statistic from a sample of size n is obtained by solving
the equation

    \int_{m(i|n)}^{\infty} f(x)\,dx = \frac{i}{n + 1}

for m(i | n). Similarly the formula used for the asymptotic variance of x(i | n)
is

    \frac{i(n - i + 1)}{n(n + 1)^2 \{f[m(i | n)]\}^2}.
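For the unit normal distribution these two asymptotic formulas are easy to evaluate. The sketch below assumes that the mean solves P(X > m) = i/(n + 1), with i = 1 denoting the largest observation, and that the variance is i(n - i + 1) / (n (n + 1)^2 f(m)^2); the function name is hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # unit normal: zero mean, unit variance


def asymptotic_normal_order_stat(i: int, n: int) -> tuple:
    """Return the asymptotic (mean, standard deviation) of x(i | n), i = 1 the largest."""
    if not 1 <= i <= n:
        raise ValueError("need 1 <= i <= n")
    # Upper-tail area i/(n + 1) corresponds to the cumulative probability 1 - i/(n + 1).
    m = _STD.inv_cdf(1.0 - i / (n + 1))
    density = _STD.pdf(m)
    variance = i * (n - i + 1) / (n * (n + 1) ** 2 * density ** 2)
    return m, math.sqrt(variance)
```

Under these assumptions the sketch returns values close to the AN columns of Table I, for example about .4307 and .9168 for n = 2, i = 1.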

Values are given for n = 1, 2, ..., 10 and i = 1, ... . If m(i | n) is
an entry in the table for means, a missing entry is m(n - i + 1 | n) = -m(i | n);
if $\sigma(i | n)$ is an entry in the table of standard deviations, a missing entry is
$\sigma(n - i + 1 | n) = \sigma(i | n)$.

TABLE I

Means and standard deviations of order statistics x(i | n) for uniform distribution
(U), normal (N), special (S), asymptotic normal (AN),
asymptotic special (AS)

                             Mean                                       Standard Deviation
 n   i       U        N        S       AN       AS         U        N        S       AN       AS

 1   1       0        0        0        0        0      1.00000  1.00000  1.00000  1.2533    .9804

 2   1    .57735   .56419   .53493   .4307    .3418     .81650   .82565   .84490   .9168    .7486

 3   1    .86603   .84628   .80240   .6745    .5466     .67082   .74798   .82783   .7867    .6823
     2       0        0        0        0        0      .77460   .66983   .58457   .7236    .5660

 4   1                                                  .56569   .70122   .82982   .7144    .6542
     2    .34641   .29701   .25540   .2533    .1992     .69282   .60038   .52582   .6340    .5035

 5   1   1.15470  1.16296  1.12449   .9674    .8136     .48795   .66898   .83642   .6670    .6415
     2    .57735   .49502   .42567   .4307    .3418     .61721   .55814   .50390   .5798    .4730
     3                                                  .65465   .53557   .44903   .5605    .4384

 6   1   1.23718  1.26721  1.23847  1.0676    .9114     .42857   .64492   .84423   .6331    .6330
     2    .74231   .64176   .55458   .5659    .4539     .55328   .52874   .49425   .5426    .4567
     3    .24744   .20155   .16785   .1800    .1412     .60609   .49620   .41648   .5147    .4057

 7   1   1.29904  1.35218  1.33506  1.1504    .9957     .38188   .62603   .85217   .6072    .6141
     2    .86603   .75737   .65892   .6745    .5462     .50000   .50670   .48992   .5160    .4359
     3    .43301   .35271   .29375   .3186    .2512     .55902   .46875   .39963   .4826    .3772
     4       0        0        0        0        0      .57735   .45874   .37747   .4737    .3617

 8   1   1.34715  1.42360  1.41892  1.2207   1.0697     .34427   .61066   .85988   .5867    .6276
     2    .96225   .85222   .74690   .7647    .6259     .45542   .48930   .48823   .4936    .4402
     3                                                  .51640   .44807   .38998   .4584    .3743
     4    .19245   .15251   .12502   .1397    .1094     .54433   .43264   .35616   .4447    .3494

 9   1   1.38564  1.48501  1.49358  1.2816   1.1358     .31334   .59780   .86725   .5691    .6268
     2   1.03923   .93230   .82317   .8416    .6954     .41779   .47508   .48800   .4763    .4361
     3    .69282   .57197   .47995   .5244    .4191     .47863   .43171   .38414   .4393    .3722
     4                                                  .51168   .41303   .34321   .4227    .3356
     5       0        0        0        0        0      .52223   .40751   .33173   .4178    .3268

10   1   1.41713  1.53875  1.56057  1.3352   1.1956     .28748   .58681   .87423   .5557    .6275
     2   1.10222  1.00135   .89062   .9085    .7574     .38569   .46318   .48859   .4619    .4334
     3    .78730   .65608   .55336   .6046    .4866     .44536   .41820   .38054   .4238    .3604
     4    .47238   .37572   .30866   .3488    .2754     .48105   .39756   .33477   .4052    .3261
     5    .15746   .12274   .09961   .1142    .0894     .49793   .38857   .31190   .3973    .3117

Table II gives the variances and covariances of the order statistics for the
normal distribution (N) and the same quantities as approximated by the
asymptotic formulas (AN). The asymptotic covariance between x(i | n) and x(j | n)
is given by

    \frac{i(n - j + 1)}{n(n + 1)^2 f[m(i | n)]\,f[m(j | n)]}, \qquad (i < j).

Symmetry relations exist for supplying the missing entries,

    cov [x(i | n), x(j | n)] = cov [x(n - i + 1 | n), x(n - j + 1 | n)].

It might seem more natural to use the factor n + 2 rather than n in the
denominator of the asymptotic variances and covariances so that the formulas would
more nearly agree with those for the uniform distribution. However, the use of
n gives much better approximations for the normal and the special distribution.

Table III gives the variances and covariances of the order statistics for the
uniform distribution (U), and Table IV gives the corresponding results for the
special distribution (S). Table V gives the asymptotic variances and
covariances for the special distribution (AS).

Table VI compares the correlation coefficients between the order statistics
x(i | n) and x(j | n) for the uniform (U), the normal (N), and the special
distribution (S).

It seems worthwhile to call attention to the following:

(1). Even for n = 10, the asymptotic formulas do not give satisfactory mean
values for the order statistics.

(2). For n > 8, the asymptotic standard deviations for the normal are close
enough to be very useful. For the special distribution we must except the two
order statistics on each end from this statement.

TABLE II

Variances and covariances of the order statistics x(i | n) for the
normal (N) and the asymptotic normal (AN)



(3). For n > 8, the asymptotic variances and covariances of the normal are
close enough for many, if not most purposes.

(4). For the special distribution, only the variances and covariances of
moderately central order statistics are adequately given by the asymptotic formulas.

TABLE III

Variances and covariances for the uniform distribution (U)

 n   i     j = 1     2       3       4       5       6       7       8       9      10

 2   1    .66667  .33333

 3   1
     2

 4   1                    .16000
     2                    .32000

 5   1    .23810  .19047  .14286  .09522  .04762
     2            .38096  .28571  .19047
     3                    .42857

 6   1    .18367  .15306  .12246          .06122  .03061
     2            .30612  .24490  .18367  .12245
     3                    .36735  .27551

 7   1    .14583  .12500  .10417  .08333
     2            .25000  .20833  .16667
     3                    .31250
     4                            .33333

 8   1    .11852          .08889  .07407
     2            .20741  .17778  .14815  .11852
     3                    .26667  .22222  .17778  .13333
     4                            .29630

 9   1                            .06545  .05455
     2            .17455  .15273                                  .04363
     3                                            .13091
     4                            .26182  .21813  .17455
     5

10   1                                                                    .01653  .0082
     2            .1487   .1322                           .0661   .0495
     3                    .1983   .1735   .1487   .1239           .0743
     4                            .2314   .1983   .1652   .1322
     5                                    .2479   .2066

(5). The correlation coefficients change rather little from distribution to
distribution, the poorest approximation being for end order statistics.

TABLE IV

Variances and covariances for the special distribution (S)

It is believed that the means are correct to within one unit in the fifth decimal
and that the standard deviations are correct to within 2 or 3 units in the fifth
decimal.


TABLE V

Variances and covariances of the special distribution as computed
from asymptotic formulas

» 

X 

1 

2 

3 

4 

5 

6 

7 

s 

9 

10 

2 

1 











3 

1 

46550 











2 











4 

1 

42792 

20168 

13444 

10698 








2 


.25347 

16898 








6 

1 

41156 

19167 

. 12579 

09606 

08231 







2 


.22368 

.14679 









3 



19221 








6 

1 



12105 

H 

07464 

06679 






2 


20861 

.13629 

mm 

.08341 







3 



16467 

.12343 







7 

1 

37716 

.17627 



06782 


.05388 

HI 




2 



, 12268 


07354 

Kil 






3 



, 14232 


08538 







4 




13731 




■ 



S 

1 

.39389 

18276 

. 11746 

08669 

06935 

.05873 


,04924 




2 


,19382 

12458 

09194 

.07355 

06229 

05538 





3 



14011 

10341 

08272 

,07006 






4 




.12211 

09769 






9 

1 

.39286 

.18226 

11881 



05727 


04556 

04367 



2 



. 12398 




Hi 

.04754 




3 







Hi 





4 




11265 


07512 






5 





BB 






10 

1 

39373 

18242 

. 11677 

08660 

06775 

06646 

04891 

,04379 

04054 

03937 


2 


18784 

12024 

.08813 

06977 

05814 

06036 

04608 

04174 



3 



12988 

09620 

,07536 

.06280 

05440 

.04871 




4 




10633 

08417 

07014 

06076 





6 





09716 

08098 






The evaluation of the covariances was much more troublesome, requiring the 
evaluation of iterated integrals of the form 

f xf(x)F^(x) f — F(i)]' d( dx. 

J--CO V— OO

Necessary linear combinations of such forms give rise to considerable loss of
accuracy. The covariances are believed to be correct to within 1 unit in the
second decimal (except for one or two values which may be off by two units).

TABLE VI

Correlation coefficients X 10^ between order statistics x(i | n), x(j | n) for the
uniform (U), normal (N), and special distribution (S)



Better tables of these covariances are badly needed, and it is hoped that someone
will provide them.

The asymptotic values are correct to the two decimals given.

5. Computation in terms of the representing function. It will prove
convenient in working with the special distribution, as indeed it does in many
statistical procedures, to introduce the representing function r(u), which is a
monotone function such that

    Pr\{r(u_1) < x < r(u_2)\} = u_2 - u_1, \qquad u_2 > u_1.

Thus if u has a uniform (= rectangular on [0, 1]) distribution then x = r(u)
defines a variate with the given distribution.

The ith order statistic of n from the uniform distribution, $u_{i|n}$, is distributed
according to

    i\binom{n}{i}\,u^{n-i}(1 - u)^{i-1}\,du, \qquad 0 < u < 1,

where it is important to remember that $u_{1|n}$ is the largest and not the smallest
order statistic; and the joint distribution of $u = u_{i|n}$ and $v = u_{j|n}$, (j > i),
is given by

    i(j - i)\binom{n}{i,\ j-i,\ n-j}\,(1 - u)^{i-1}(u - v)^{j-i-1}v^{n-j}\,du\,dv, \qquad 0 < v < u < 1,

where $\binom{n}{i,\ j-i,\ n-j}$ is a multinomial coefficient.

The means, variances, and covariances which we desire can be written as
follows (it is immaterial whether we think of expectations over x's or over u's):

    E(x_{i|n}) = E(r(u_{i|n})) = i\binom{n}{i} \int_0^1 r(u)\,u^{n-i}(1 - u)^{i-1}\,du,

    var (x_{i|n}) = E(x_{i|n}^2) - (E(x_{i|n}))^2 = E(r^2(u_{i|n})) - (E(x_{i|n}))^2
                  = i\binom{n}{i} \int_0^1 r^2(u)\,u^{n-i}(1 - u)^{i-1}\,du - (E(x_{i|n}))^2,

    cov (x_{i|n}, x_{j|n}) = E(x_{i|n}x_{j|n}) - E(x_{i|n})E(x_{j|n})
                  = E(r(u_{i|n})\,r(u_{j|n})) - E(x_{i|n})E(x_{j|n})
                  = i(j - i)\binom{n}{i,\ j-i,\ n-j} \int_0^1 \int_v^1 r(u)\,r(v)\,(1 - u)^{i-1}(u - v)^{j-i-1}v^{n-j}\,du\,dv
                    - E(x_{i|n})E(x_{j|n}).
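The formula for the mean lends itself to a direct numerical check. The following sketch assumes the density i C(n, i) u^(n-i) (1-u)^(i-1) for the ith largest of n uniform values and uses a plain midpoint rule; the function names and the step count are hypothetical choices of this illustration.

```python
import math
from statistics import NormalDist


def order_stat_mean(r, i: int, n: int, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of E[x(i | n)] for the representing function r."""
    coeff = i * math.comb(n, i)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h  # midpoints avoid u = 0 and u = 1, where r(u) may be infinite
        total += r(u) * u ** (n - i) * (1.0 - u) ** (i - 1)
    return coeff * total * h


def uniform_r(u: float) -> float:
    """Representing function of the uniform distribution with zero mean and unit variance."""
    return math.sqrt(12.0) * (u - 0.5)


normal_r = NormalDist().inv_cdf  # representing function of the unit normal
```

Under these assumptions `order_stat_mean(uniform_r, 1, 2)` and `order_stat_mean(normal_r, 1, 2)` come out close to the Table I entries .57735 and .56419.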


Introducing $E_{s,t}$ by

    E_{s,t} = \int_0^1 \int_v^1 r(u)\,r(v)\,u^{s}v^{t}\,du\,dv,

we have

    E(x_{i|n}x_{j|n}) = i(j - i)\binom{n}{i,\ j-i,\ n-j} \sum_{k=0}^{i-1} \sum_{m=0}^{j-i-1} (-1)^{k+j-i-1-m} \binom{i-1}{k}\binom{j-i-1}{m} E_{k+m,\ n-i-1-m},

and, in particular,

    E(x_{1|2}x_{2|2}) = 2E_{0,0},

    E(x_{1|5}x_{4|5}) = 60E_{2,1} - 120E_{1,2} + 60E_{0,3}.

Introducing $E_{s,s}$ by

    E_{s,s} = \int_0^1 r^2(u)\,u^{s}\,du,

we have

    E(x_{i|n}^2) = i\binom{n}{i} \sum_{k=0}^{i-1} (-1)^{k} \binom{i-1}{k} E_{n-i+k,\ n-i+k}

and, in particular,

    E(x_{2|5}^2) = 20E_{3,3} - 20E_{4,4}.

Introducing $E_s$ by

    E_s = \int_0^1 r(u)\,u^{s}\,du,

we have

    E(x_{i|n}) = i\binom{n}{i} \sum_{k=0}^{i-1} (-1)^{k} \binom{i-1}{k} E_{n-i+k}

and, in particular,

    E(x_{3|5}) = 30E_2 - 60E_3 + 30E_4.

Thus the computation of the desired means, variances, and covariances is
reduced to the computation of the integrals $E_s$, $E_{s,s}$, and $E_{s,t}$.

We shall also want to calculate the asymptotic approximations to the means,
variances, and covariances of the order statistics. For the uniform distribution,
it is well known that

    mean (u_{i|n}) = \frac{n - i + 1}{n + 1},

    var (u_{i|n}) = \frac{i(n - i + 1)}{(n + 1)^2(n + 2)},

    cov (u_{i|n}, u_{j|n}) = \frac{i(n - j + 1)}{(n + 1)^2(n + 2)}, \qquad (i < j).

These asymptotic formulas are transformed from u to x by the relations x =
r(u) and dx = r'(u) du, giving

    approx mean (x_{i|n}) = r\left(\frac{n - i + 1}{n + 1}\right),

    approx var (x_{i|n}) = \left[r'\left(\frac{n - i + 1}{n + 1}\right)\right]^2 \frac{i(n - i + 1)}{(n + 1)^2(n + 2)},

    approx cov (x_{i|n}, x_{j|n}) = r'\left(\frac{n - i + 1}{n + 1}\right) r'\left(\frac{n - j + 1}{n + 1}\right) \frac{i(n - j + 1)}{(n + 1)^2(n + 2)}, \qquad (i < j);

as noted above, in our calculations we have replaced n + 2 by n in the
denominator.
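These relations can be written out as a short program. The sketch below assumes the uniform moments quoted above, with i = 1 the largest value and i <= j, and takes the derivative r'(u) as a user-supplied function; all names are hypothetical, and the switch `denominator_n` implements the replacement of n + 2 by n mentioned in the text.

```python
from fractions import Fraction
from statistics import NormalDist

_STD = NormalDist()


def uniform_moments(i: int, j: int, n: int) -> tuple:
    """Exact mean of u(i|n) and covariance of u(i|n), u(j|n) on [0, 1], for i <= j."""
    if not 1 <= i <= j <= n:
        raise ValueError("need 1 <= i <= j <= n")
    mean_i = Fraction(n - i + 1, n + 1)
    cov_ij = Fraction(i * (n - j + 1), (n + 1) ** 2 * (n + 2))
    return mean_i, cov_ij


def approx_cov(r_prime, i: int, j: int, n: int, denominator_n: bool = True) -> float:
    """Transformed approximation to cov(x(i|n), x(j|n)); uses n in place of n + 2 if requested."""
    last = n if denominator_n else n + 2
    p_i = (n - i + 1) / (n + 1)
    p_j = (n - j + 1) / (n + 1)
    return r_prime(p_i) * r_prime(p_j) * i * (n - j + 1) / ((n + 1) ** 2 * last)


def normal_r_prime(u: float) -> float:
    """Derivative of the unit normal representing function: 1 / phi(Phi^{-1}(u))."""
    return 1.0 / _STD.pdf(_STD.inv_cdf(u))
```

Multiplying the exact uniform covariances by 12 rescales them to unit variance, which gives the first row of Table III (.66667 and .33333 for n = 2).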

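6. Reduction of integrals for the special case. When the representing
function is

    x = r(u) = \frac{1}{(1 - u)^{\lambda}} - \frac{1}{u^{\lambda}}, \qquad (\lambda > 0),

we obtain a symmetrical distribution with long tails. (For the normal
distribution r(u) = o(ln u) as u -> 0.) The integrals we want are

    E_s = \int_0^1 \{(1 - u)^{-\lambda} - u^{-\lambda}\}\,u^{s}\,du,

    E_{s,s} = \int_0^1 \{(1 - u)^{-\lambda} - u^{-\lambda}\}^2\,u^{s}\,du,

    E_{s,t} = \int_0^1 \int_v^1 \{(1 - u)^{-\lambda} - u^{-\lambda}\}\{(1 - v)^{-\lambda} - v^{-\lambda}\}\,u^{s}v^{t}\,du\,dv,

which can be expressed as

    E_s = A_s(\lambda) - B_s(\lambda),

    E_{s,s} = A_{s,s}(\lambda) - 2B_{s,s}(\lambda) + C_{s,s}(\lambda),

    E_{s,t} = A_{s,t}(\lambda) - B_{s,t}(\lambda) - C_{s,t}(\lambda) + D_{s,t}(\lambda),

where

    A_s(\lambda) = \int_0^1 (1 - u)^{-\lambda}u^{s}\,du = b(-\lambda, s),

    B_s(\lambda) = \int_0^1 u^{-\lambda}u^{s}\,du = \frac{1}{s + 1 - \lambda},

    A_{s,s}(\lambda) = \int_0^1 (1 - u)^{-2\lambda}u^{s}\,du = b(-2\lambda, s),

    B_{s,s}(\lambda) = \int_0^1 (1 - u)^{-\lambda}u^{-\lambda}u^{s}\,du = b(-\lambda, s - \lambda),

    C_{s,s}(\lambda) = \int_0^1 u^{-2\lambda}u^{s}\,du = \frac{1}{s + 1 - 2\lambda},

    A_{s,t}(\lambda) = \int_0^1 \int_v^1 (1 - u)^{-\lambda}(1 - v)^{-\lambda}u^{s}v^{t}\,du\,dv
                     = \sum_{\nu=0}^{s} (-1)^{\nu} \binom{s}{\nu} \frac{b(\nu + 1 - 2\lambda, t)}{\nu + 1 - \lambda},

    B_{s,t}(\lambda) = \int_0^1 \int_v^1 (1 - u)^{-\lambda}v^{-\lambda}u^{s}v^{t}\,du\,dv
                     = \sum_{\nu=0}^{s} (-1)^{\nu} \binom{s}{\nu} \frac{b(\nu + 1 - \lambda, t - \lambda)}{\nu + 1 - \lambda}
                     = \frac{b(s + t + 1 - \lambda, -\lambda)}{t + 1 - \lambda},

    C_{s,t}(\lambda) = \int_0^1 \int_v^1 u^{-\lambda}(1 - v)^{-\lambda}u^{s}v^{t}\,du\,dv
                     = \frac{1}{s + 1 - \lambda} \{b(-\lambda, t) - b(s + t + 1 - \lambda, -\lambda)\},

    D_{s,t}(\lambda) = \int_0^1 \int_v^1 u^{-\lambda}v^{-\lambda}u^{s}v^{t}\,du\,dv
                     = \frac{1}{(t + 1 - \lambda)(s + t + 2 - 2\lambda)},

where throughout

    b(p, q) = \frac{p!\,q!}{(p + q + 1)!} = \frac{\Gamma(p + 1)\Gamma(q + 1)}{\Gamma(p + q + 2)} = B(p + 1, q + 1).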
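The single integrals are simple to evaluate with the gamma function. The sketch below assumes E_s = b(-lambda, s) - 1/(s + 1 - lambda) and E_{s,s} = b(-2 lambda, s) - 2 b(-lambda, s - lambda) + 1/(s + 1 - 2 lambda). The value lambda = 0.1 is an assumption of this illustration: it is not stated in the text above, and was chosen because it reproduces the Table I entry .53493 for the mean of the larger of two observations after scaling to unit variance.

```python
import math


def b(p: float, q: float) -> float:
    """b(p, q) = Gamma(p + 1) Gamma(q + 1) / Gamma(p + q + 2) = B(p + 1, q + 1)."""
    return math.gamma(p + 1.0) * math.gamma(q + 1.0) / math.gamma(p + q + 2.0)


def E_single(s: int, lam: float) -> float:
    """E_s = A_s - B_s, the integral of r(u) u^s over (0, 1)."""
    return b(-lam, s) - 1.0 / (s + 1.0 - lam)


def E_square(s: int, lam: float) -> float:
    """E_{s,s} = A_{s,s} - 2 B_{s,s} + C_{s,s}, the integral of r(u)^2 u^s over (0, 1)."""
    return b(-2.0 * lam, s) - 2.0 * b(-lam, s - lam) + 1.0 / (s + 1.0 - 2.0 * lam)


LAM = 0.1  # assumed value of lambda, see the note above

# r(u) has mean zero by symmetry, so its standard deviation is sqrt(E_{0,0}).
sd = math.sqrt(E_square(0, LAM))
mean_largest_of_2 = 2.0 * E_single(1, LAM) / sd          # E(x_{1|2}) = 2 E_1
median_of_5 = (30.0 * E_single(2, LAM) - 60.0 * E_single(3, LAM)
               + 30.0 * E_single(4, LAM))                 # E(x_{3|5}), zero by symmetry
```

The second quantity illustrates the particular case E(x_{3|5}) = 30E_2 - 60E_3 + 30E_4, which must vanish for any symmetrical distribution.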


7. Calculations for the special distribution. The computations for the special
distribution were made from the formulas in the preceding section. The
quantities A, B, C, D were computed from r = s = 0 to r + s = 8, whence the values
of $E_s$, $E_{s,s}$, $E_{s,t}$ were calculated. The values of the means, variances, and
covariances were then obtained from the formulas of section 5.

The means, variances, and covariances are believed to be accurate to the five
decimal places given.

8. Formulas and accuracy for the uniform. The means, variances, and
covariances of the uniform are given near the end of section 5. Since r(u) = u,
they are also the values given by the asymptotic approximation, when n + 2
is used.

The tabulated values were computed to six decimal places and rounded to the
four or five decimals given.

SEQUENTIAL CONFIDENCE INTERVALS FOR THE MEAN OF A NORMAL
DISTRIBUTION WITH KNOWN VARIANCE

By Charles Stein and Abraham Wald
Columbia University

1. Summary. We consider sequential procedures for obtaining confidence
intervals of prescribed length and confidence coefficient for the mean of a normal
distribution with known variance. A procedure achieving these aims is called
optimum if it minimizes the least upper bound (with respect to the mean) of the
expected number of observations. The result proved is that the usual
non-sequential procedure is optimum.

2. Introduction. The problem of sequential confidence sets in general has
been considered briefly by one of the authors [1]. Let $\{X_i\}$, (i = 1, 2, ...),
be a sequence of random variables whose distribution is specified except for the
value of a parameter $\theta$ whose range is a space $\Omega$. Sequential confidence sets are
determined by a rule as to when to stop sampling, together with a function of
the sample whose value is one of a specified class of subsets of $\Omega$. The class of
subsets is chosen in advance depending on the purpose of the estimation. For
example, it may be the class of all intervals of prescribed length or the class of
all sets whose diameter does not exceed a given value. It is required that the
probability that this (random) set covers $\theta$ should be greater than or equal to a
specified confidence coefficient $\alpha$ for all $\theta$. A procedure for finding sequential
confidence intervals is considered optimum if it minimizes some specified function
of the expected numbers of observations. Here this function is taken to be the
least upper bound. In contrast with the result of this paper, a case where
sequential confidence intervals may have an advantage over non-sequential
procedures has been given by one of the authors [2]. The $X_i$ are independently
normally distributed with unknown mean and unknown variance, and the
problem is to find confidence intervals of fixed length for the unknown mean. As
was first shown by Dantzig [3] this cannot be accomplished by a non-sequential
procedure. Another case where this is true is the problem of finding confidence
intervals of the form $(p_0, kp_0)$ where k is a specified number greater than 1, for
the probability in a binomial distribution.

Let $\{X_i\}$, (i = 1, 2, ...), be independently normally distributed with
unknown mean $\xi$ and known variance $\sigma_1^2$. It is desired to specify a sequential
procedure for obtaining confidence intervals of fixed length l for the mean $\xi$.
This is provided by a rule according to which at each stage of the experiment,
after obtaining the first m observations $X_1, \ldots, X_m$ for each integral value m,
one makes one of the following decisions:

a) Take an (m + 1)st observation.

b) Terminate the procedure and state that the mean lies in the interval
$(Y - \tfrac{1}{2}l,\ Y + \tfrac{1}{2}l)$ where $Y = S_m(X_1, \ldots, X_m)$, $S_m$ being a measurable
real-valued function. The serial number m of the observation on which the
procedure terminates is, of course, a random variable and will be denoted by n.
For any relation R the symbol $P\{R \mid \xi\}$ will denote the probability that R
holds when $\xi$ is the true mean of $X_i$. The confidence coefficient of a sequential
procedure S is defined by


(1)    \alpha(S) = \mathrm{g.l.b.}_{\xi}\ P\{Y - \tfrac{1}{2}l < \xi < Y + \tfrac{1}{2}l \mid \xi\}.

Denote by $n_0(S)$ the maximum expected number of observations, i.e.

(2)    n_0(S) = \mathrm{l.u.b.}_{\xi}\ E(n \mid \xi, S)

where $E(n \mid \xi, S)$ denotes the expected value of n when $\xi$ is the true mean and the
procedure S is used.

A procedure S will be considered optimum if, for all S' such that $\alpha(S') = \alpha(S)$,

(3)    n_0(S) \leq n_0(S').

It will be shown that an optimum procedure $S(\nu, c)$ can be obtained as follows:

a) For all $m < \nu$, $\nu$ a fixed positive integer, take another observation.

b) For $m = \nu$, terminate the procedure if

(4)

and let $Y = \frac{1}{\nu}\sum_{i=1}^{\nu} X_i$. (The inequality (4) is used merely as a device for fixing
the probability of taking $\nu$ observations, this random event to be independent
of whether $(Y - \tfrac{1}{2}l,\ Y + \tfrac{1}{2}l)$ covers $\xi$.)

c) Otherwise take a $(\nu + 1)$st observation, terminating the process, and let

    Y = \frac{1}{\nu + 1}\sum_{i=1}^{\nu+1} X_i.

When c = 0, this is the usual non-sequential procedure.

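Clearly,

(5) $\alpha[S(\nu, c)] = P\{\chi^2_{\nu-1} > c\}\, H\!\left(\dfrac{l\sqrt{\nu}}{2\sigma}\right) + \left[1 - P\{\chi^2_{\nu-1} > c\}\right] H\!\left(\dfrac{l\sqrt{\nu+1}}{2\sigma}\right)$

where

(6) $H(u) = \dfrac{1}{\sqrt{2\pi}}\int_{-u}^{u} e^{-x^2/2}\,dx.$

Also

(7) $n_0[S(\nu, c)] = \nu + 1 - P\{\chi^2_{\nu-1} > c\}.$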

By a proper choice of $\nu$ and $c$ we can achieve any desired confidence coefficient. There is no essential loss of generality in considering only the case $\sigma = 1$, and this will be done in the remainder of this paper.
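As a numerical illustration of how $\nu$ and the randomization probability can be chosen, the following Python sketch works in the case $\sigma = 1$. It parametrizes the procedure by $a = 1 - P\{\chi^2_{\nu-1} > c\}$, the probability of taking the $(\nu+1)$st observation, so that the coverage is $(1-a)H(\tfrac12 l\sqrt{\nu}) + aH(\tfrac12 l\sqrt{\nu+1})$ and the expected sample size is $\nu + a$. The function names `H`, `coverage` and `design` are illustrative assumptions, and the sketch assumes the target coefficient is at least $H(\tfrac12 l)$.

```python
import math


def H(u):
    """H(u) = (2*pi)^(-1/2) * integral of exp(-x^2/2) over (-u, u)."""
    return math.erf(u / math.sqrt(2.0))


def coverage(l, nu, a):
    """Confidence coefficient of S(nu, c) for sigma = 1, with a = 1 - P{chi2 > c}."""
    return (1 - a) * H(0.5 * l * math.sqrt(nu)) + a * H(0.5 * l * math.sqrt(nu + 1))


def design(l, alpha):
    """Return (nu, a) such that coverage(l, nu, a) equals alpha, with 0 <= a < 1.

    Assumes H(l/2) <= alpha < 1, so that at least one observation is needed.
    """
    if not (H(0.5 * l) <= alpha < 1.0):
        raise ValueError("alpha must satisfy H(l/2) <= alpha < 1")
    nu = 1
    # Increase nu until alpha lies between the coverages of nu and nu + 1 observations.
    while H(0.5 * l * math.sqrt(nu + 1)) <= alpha:
        nu += 1
    h_lo = H(0.5 * l * math.sqrt(nu))
    h_hi = H(0.5 * l * math.sqrt(nu + 1))
    a = (alpha - h_lo) / (h_hi - h_lo)
    return nu, a


if __name__ == "__main__":
    nu, a = design(l=1.0, alpha=0.95)
    print(nu, a, nu + a, coverage(1.0, nu, a))
```

The value of $c$ itself would then be the point of the $\chi^2_{\nu-1}$ distribution whose upper tail probability is $1 - a$; that inversion is omitted from the sketch.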


3. A lower bound for $n_0(S)$ and an upper bound for $\alpha(S)$. Consider any sequential procedure $S$ for obtaining confidence intervals of length $l$. Put

(8) $\alpha(\xi, S) = P\{Y - \tfrac{1}{2}l < \xi < Y + \tfrac{1}{2}l \mid \xi\}.$

That is, $\alpha(\xi, S)$ is the probability that the confidence interval will cover the true mean $\xi$ when the procedure $S$ is used. According to (1),

(9) $\alpha(S) = \underset{\xi}{\text{g.l.b.}}\ \alpha(\xi, S).$

In order to obtain a lower bound for $n_0(S)$ and an upper bound for $\alpha(S)$, we suppose that the procedure $S$ is applied when $\xi$ is not a fixed number but a random variable normally distributed with mean 0 and variance $\sigma^2$. Then the probability that the confidence interval covers $\xi$ is

(10) $\alpha(\sigma, S) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-\xi^2/2\sigma^2}\,\alpha(\xi, S)\,d\xi \ge \alpha(S)$

and the expected number of observations is

(11) $E(n \mid \sigma, S) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-\xi^2/2\sigma^2}\,E(n \mid \xi, S)\,d\xi \le n_0(S).$

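Let $p_m(\xi, S)$, $(m = 1, 2, \cdots,$ ad inf.$)$, denote the probability that $n = m$ when $\xi$ is the true mean and procedure $S$ is used. Put

(12) $p_m(\sigma, S) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-\xi^2/2\sigma^2}\,p_m(\xi, S)\,d\xi.$

Since

(13) $E(n \mid \sigma, S) = \sum_{m=1}^{\infty} m\,p_m(\sigma, S)$

we obtain from (11)

(14) $\sum_{m=1}^{\infty} m\,p_m(\sigma, S) \le n_0(S).$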

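We shall now derive an upper bound for $\alpha(\sigma, S)$. Since $X_i = \xi + \epsilon_i$ where the $\epsilon_i$ are independently normally distributed with mean 0 and variance 1, the joint distribution of $\xi$ and $X_i$, $(i = 1, \cdots, m)$, is a multivariate normal distribution with

(15) $E\xi = EX_i = 0$

and covariance matrix

(16) $\begin{pmatrix} \sigma^2 & \sigma^2 & \sigma^2 & \cdots & \sigma^2 \\ \sigma^2 & \sigma^2 + 1 & \sigma^2 & \cdots & \sigma^2 \\ \sigma^2 & \sigma^2 & \sigma^2 + 1 & \cdots & \sigma^2 \\ \vdots & \vdots & \vdots & & \vdots \\ \sigma^2 & \sigma^2 & \sigma^2 & \cdots & \sigma^2 + 1 \end{pmatrix}.$

Thus the conditional distribution of $\xi$ given $X_1, \cdots, X_m$ is normal with mean

(17) $\sigma^2 (1, 1, \cdots, 1) \begin{pmatrix} \sigma^2 + 1 & \sigma^2 & \cdots & \sigma^2 \\ \sigma^2 & \sigma^2 + 1 & \cdots & \sigma^2 \\ \vdots & \vdots & & \vdots \\ \sigma^2 & \sigma^2 & \cdots & \sigma^2 + 1 \end{pmatrix}^{-1} \begin{pmatrix} X_1 \\ \vdots \\ X_m \end{pmatrix}$

$\quad = \sigma^2 (1, 1, \cdots, 1) \begin{pmatrix} \dfrac{(m-1)\sigma^2 + 1}{m\sigma^2 + 1} & -\dfrac{\sigma^2}{m\sigma^2 + 1} & \cdots & -\dfrac{\sigma^2}{m\sigma^2 + 1} \\ \vdots & \vdots & & \vdots \\ -\dfrac{\sigma^2}{m\sigma^2 + 1} & -\dfrac{\sigma^2}{m\sigma^2 + 1} & \cdots & \dfrac{(m-1)\sigma^2 + 1}{m\sigma^2 + 1} \end{pmatrix} \begin{pmatrix} X_1 \\ \vdots \\ X_m \end{pmatrix} = \dfrac{\sigma^2}{m\sigma^2 + 1}\sum_{i=1}^{m} X_i$

and variance

(18) $\sigma^2 - \dfrac{m\sigma^4}{m\sigma^2 + 1} = \dfrac{\sigma^2}{m\sigma^2 + 1}.$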
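The matrix inversion behind (17) and (18) can be checked mechanically. The sketch below, in plain Python, multiplies the covariance matrix of $X_1, \cdots, X_m$ by the claimed inverse and also returns the conditional mean and variance; the function names `posterior` and `inverse_is_correct` are illustrative assumptions.

```python
def posterior(xs, sigma2):
    """Conditional mean and variance of xi given observations xs, as in (17) and (18).

    xi has prior N(0, sigma2); each X_i = xi + e_i with e_i ~ N(0, 1).
    """
    m = len(xs)
    shrink = sigma2 / (m * sigma2 + 1.0)
    return shrink * sum(xs), shrink


def inverse_is_correct(m, sigma2, tol=1e-12):
    """Check that (I + sigma2*J) * (I - k*J) = I with k = sigma2 / (m*sigma2 + 1)."""
    k = sigma2 / (m * sigma2 + 1.0)
    cov = [[sigma2 + (1.0 if i == j else 0.0) for j in range(m)] for i in range(m)]
    inv = [[(1.0 if i == j else 0.0) - k for j in range(m)] for i in range(m)]
    for i in range(m):
        for j in range(m):
            entry = sum(cov[i][t] * inv[t][j] for t in range(m))
            target = 1.0 if i == j else 0.0
            if abs(entry - target) > tol:
                return False
    return True


if __name__ == "__main__":
    print(posterior([1.0, 2.0, 3.0], sigma2=1.0))
    print(inverse_is_correct(5, 2.0))
```

The diagonal entries of the claimed inverse are $1 - \sigma^2/(m\sigma^2+1) = [(m-1)\sigma^2+1]/(m\sigma^2+1)$, in agreement with the matrix displayed in (17).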


If $X_1, \cdots, X_m$ is a sequence for which the process is terminated on the $m$th trial, the conditional probability that the interval of length $l$ will cover $\xi$ is clearly maximized by taking

(19) $Y = E(\xi \mid X_1, \cdots, X_m) = \dfrac{\sigma^2}{m\sigma^2 + 1}\sum_{i=1}^{m} X_i$

and, by (18), this probability has the value $H(c_m)$ where $H$ is defined by (6) and

(20) $c_m = \dfrac{l}{2}\sqrt{\dfrac{m\sigma^2 + 1}{\sigma^2}}.$

Hence,

(21) $\alpha(\sigma, S) \le \sum_{m=1}^{\infty} p_m(\sigma, S)\,H(c_m).$

From this and (10) we obtain

(22) $\alpha(S) \le \sum_{m=1}^{\infty} p_m(\sigma, S)\,H(c_m).$

This upper limit of $\alpha(S)$ and the lower limit of $n_0(S)$ given in (14) will be used later to prove that $S(\nu, c)$ is an optimum procedure.

4. Maximum value of $\sum_{1}^{\infty} p_m(\sigma, S)H(c_m)$ subject to the condition that $\sum_{1}^{\infty} m\,p_m(\sigma, S)$ does not exceed a given bound. We shall show that the maximum of $\sum_{1}^{\infty} p_m(\sigma, S)H(c_m)$ subject to

$E(n \mid \sigma, S) = \sum_{1}^{\infty} m\,p_m(\sigma, S) \le \nu + a,$

where $\nu$ is a positive integer and $0 \le a < 1$, is obtained by choosing $p_m(\sigma, S) = p_m^*$, defined by

(23) $p_m^* = 0$ for $m < \nu$ or $m > \nu + 1$, $\quad p_\nu^* = 1 - a$, $\quad p_{\nu+1}^* = a.$

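For, suppose to the contrary that there exists a sequence $\{p_m\}$ such that the following conditions hold:

(24) $p_m \ge 0, \quad \sum_{1}^{\infty} p_m = 1, \quad \sum_{1}^{\infty} m\,p_m \le \nu + a = \sum_{1}^{\infty} m\,p_m^*, \quad \sum_{1}^{\infty} p_m H(c_m) > \sum_{1}^{\infty} p_m^* H(c_m).$

We have

(25) $H(u) = \dfrac{2}{\sqrt{2\pi}}\int_{0}^{u} e^{-x^2/2}\,dx = \dfrac{1}{\sqrt{2\pi}}\int_{0}^{u^2} y^{-1/2} e^{-y/2}\,dy.$

Put

(26) $C = H(c_{\nu+1}) - H(c_\nu) = \dfrac{1}{\sqrt{2\pi}}\int_{c_\nu^2}^{c_{\nu+1}^2} y^{-1/2} e^{-y/2}\,dy.$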
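With the aid of $\sum p_m = 1 = \sum p_m^*$, we obtain from the last two inequalities in (24)

(27) $0 < \sum_{1}^{\infty} (p_m - p_m^*)H(c_m) - C\sum_{1}^{\infty} (p_m - p_m^*)\,m = \sum_{m \ne \nu} (p_m - p_m^*)K_m$

where

(28) $K_m = H(c_m) - H(c_\nu) - (m - \nu)[H(c_{\nu+1}) - H(c_\nu)].$

Clearly $K_{\nu+1} = 0$. Also, for $m < \nu$, since the integrand is a strictly decreasing function of $y$ and $c_{m+1}^2 - c_m^2 = \tfrac{1}{4}l^2$ for every $m$,

(29) $K_m = (\nu - m)\dfrac{1}{\sqrt{2\pi}}\int_{c_\nu^2}^{c_{\nu+1}^2} y^{-1/2} e^{-y/2}\,dy - \dfrac{1}{\sqrt{2\pi}}\int_{c_m^2}^{c_\nu^2} y^{-1/2} e^{-y/2}\,dy$

$\quad < (\nu - m)\dfrac{l^2}{4\sqrt{2\pi}}\Bigl[y^{-1/2} e^{-y/2}\Bigr]_{y = c_\nu^2} - (\nu - m)\dfrac{l^2}{4\sqrt{2\pi}}\Bigl[y^{-1/2} e^{-y/2}\Bigr]_{y = c_\nu^2} = 0.$

Similarly for $m > \nu + 1$, $K_m < 0$. But $p_m^* = 0$ for $m \ne \nu, \nu + 1$ so that

(30) $\sum_{m \ne \nu} (p_m - p_m^*)K_m \le 0$

which contradicts (27) since $K_{\nu+1} = 0$.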

Thus, we have shown that the inequality

(31) $E(n \mid \sigma, S) \le \nu + a$

implies the inequality

(32) $\sum_{1}^{\infty} p_m(\sigma, S)H(c_m) \le (1 - a)H(c_\nu) + aH(c_{\nu+1}).$
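The implication from (31) to (32) lends itself to a brute-force numerical check. The sketch below draws random distributions $\{p_m\}$ on $m = 1, \cdots, M$, computes their mean $\nu + a$, and compares $\sum p_m H(c_m)$ with the two-point bound on the right of (32). The names `two_point_bound` and `worst_gap`, the support size, and the parameter values are illustrative assumptions; a random search of this kind is a sanity check and not a proof.

```python
import math
import random


def H(u):
    """H(u) as in (25)."""
    return math.erf(u / math.sqrt(2.0))


def c(m, l, sigma2):
    """c_m = (l/2) * sqrt((m*sigma2 + 1) / sigma2), as in (20)."""
    return 0.5 * l * math.sqrt(m + 1.0 / sigma2)


def two_point_bound(mean_n, l, sigma2):
    """Right member of (32) for nu = floor(mean_n), a = mean_n - nu."""
    nu = math.floor(mean_n)
    a = mean_n - nu
    return (1 - a) * H(c(nu, l, sigma2)) + a * H(c(nu + 1, l, sigma2))


def worst_gap(trials=1000, max_m=12, l=1.0, sigma2=4.0, seed=0):
    """Largest value of sum p_m H(c_m) minus the bound over random distributions."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        weights = [rng.random() for _ in range(max_m)]
        total = sum(weights)
        p = [w / total for w in weights]
        mean_n = sum((i + 1) * pm for i, pm in enumerate(p))
        value = sum(pm * H(c(i + 1, l, sigma2)) for i, pm in enumerate(p))
        worst = max(worst, value - two_point_bound(mean_n, l, sigma2))
    return worst


if __name__ == "__main__":
    # A non-positive result is consistent with (31) implying (32).
    print(worst_gap())
```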

5. Proof that $S(\nu, c)$ is an optimum procedure. Since, according to (14) and (22)

(33) $n_0(S) \ge E(n \mid \sigma, S)$ and $\alpha(S) \le \sum_{1}^{\infty} p_m(\sigma, S)H(c_m),$

it follows from the result expressed in (31) and (32) that, for any procedure $S$ satisfying the inequality

(34) $n_0(S) \le \nu + a,$

we must have

(35) $\alpha(S) \le (1 - a)H(c_\nu) + aH(c_{\nu+1})$

identically in $\sigma$. Since $H(u)$ is continuous, it follows on letting $\sigma \to \infty$ that

(36) $\alpha(S) \le (1 - a)H(\tfrac{1}{2}l\sqrt{\nu}) + aH(\tfrac{1}{2}l\sqrt{\nu + 1})$

for any procedure $S$ satisfying (34).
The right hand side of (36) is $\alpha[S(\nu, c)]$ where $c$ is chosen so that

(37) $1 - a = P\{\chi^2_{\nu-1} > c\}.$

We use an indirect proof to show that $S(\nu, c)$ is an optimum procedure. Suppose to the contrary that there is a procedure $S'$ such that

(38) $\alpha(S') = \alpha[S(\nu, c)]$

but

(39) $n_0(S') < n_0[S(\nu, c)].$

By (5) and (7), $\alpha[S(\nu, c)]$ is a continuous strictly increasing function of

$\nu + 1 - P\{\chi^2_{\nu-1} > c\}$

and this latter is $n_0[S(\nu, c)]$. If we choose $\nu'$, $c'$ so that

(40) $n_0(S') \le \nu' + 1 - P\{\chi^2_{\nu'-1} > c'\} < \nu + 1 - P\{\chi^2_{\nu-1} > c\},$

it follows that

(41) $\alpha[S(\nu', c')] < \alpha[S(\nu, c)] = \alpha(S').$

But (41) and the first part of (40) contradict the result expressed in (34) and (36).
NOTES

This section is devoted to brief research and expository articles on methodology and other short items.

A USEFUL CONVERGENCE THEOREM FOR
PROBABILITY DISTRIBUTIONS

By Henry Scheffé

University of California at Los Angeles

In problems of establishing limiting distributions it is often apparent that the probability density $p_n(x)$ of a random variable $X_n$ has a limit $p(x)$; throughout this paper $n = 1, 2, 3, \cdots$, and all limits are taken as $n \to \infty$. If $p(x)$ is the density of a random variable $X$, what we really care about then is whether the limits apply to probabilities, which involve integrals of the densities: Does $\lim \Pr\{X_n \text{ in } S\} = \Pr\{X \text{ in } S\}$ for all¹ Borel sets $S$, or, does

(1) $\lim \int_S p_n(x)\,dx = \int_S p(x)\,dx\ ?$

The question is thus one of taking a limit under an integral sign. Perhaps the most widely used justification of such a process is the following theorem of Lebesgue [1, p. 47; 2, p. 29]: If for a sequence $\{f_n(x)\}$ of integrable functions, $\lim f_n(x) = f(x)$ for almost all $x$ in $S$, then a sufficient condition that

$\lim \int_S f_n(x)\,dx = \int_S f(x)\,dx$

is that there exist an integrable function $g(x)$ which uniformly dominates the sequence $\{f_n(x)\}$, that is, $|f_n(x)| \le g(x)$ for all $n$ and all $x$ in $S$, and $\int_S g(x)\,dx < \infty$.

For example, in the excellent new treatise by Cramér the limiting form of the $t$-distribution is treated as follows [1, p. 252, other examples on pp. 369, 371]: For $n$ degrees of freedom the $t$-variable has the density

(2) $p_n(x) = c_n (1 + x^2/n)^{-(n+1)/2}$

where

(3) $c_n = (n\pi)^{-1/2}\,\Gamma(\tfrac{1}{2}(n + 1))/\Gamma(\tfrac{1}{2}n).$

It is shown fairly easily that $\lim p_n(x) = p(x)$, the density of $N(0, 1)$, where

¹ In defining the convergence of a sequence of distributions to the distribution of a discontinuous random variable $X$ it is desirable to modify this requirement so that it is demanded only of sets $S$ which are continuity intervals of $X$ [1, p. 83]. We are concerned here however only with the "absolutely continuous case" where $X$ has a probability density $p(x)$.
$N(m, \sigma^2)$ denotes the normal distribution with mean $m$ and variance $\sigma^2$. Then to prove

$\lim \int_{-\infty}^{\xi} p_n(x)\,dx = \int_{-\infty}^{\xi} p(x)\,dx$

Cramér shows that $\{p_n(x)\}$ is uniformly dominated by an integrable function.
It is instructive to consider some examples where

(4) $\lim \int_{-\infty}^{\xi} p_n(x)\,dx$

does not equal

(5) $\int_{-\infty}^{\xi} \lim p_n(x)\,dx.$

In the examples (i), (ii), (iii), $\lim p_n(x) = 0$ for all $x$ and hence (5) is zero for all $\xi$.

(i) $p_n(x) = 1$ for $-n - 1 < x < -n$, zero elsewhere. Then (4) equals 1 for all $\xi$.

(ii) $p_n(x) = 1/n$ for $-\tfrac{1}{2}n < x < \tfrac{1}{2}n$, zero elsewhere. Here (4) equals $\tfrac{1}{2}$ for all $\xi$.

(iii) $p_n(x) = 2n^2 x$ for $0 < x < 1/n$, zero elsewhere. Now (4) is zero for $\xi \le 0$, unity for $\xi > 0$.

An example in which $\lim p_n(x) \ne 0$ is

(iv) $p_n(x) = \tfrac{1}{2}[h_n(x) + p_0(x)]$, where $h_n$ is the $p_n$ of one of the above examples and $p_0$ is a fixed density. Then $\lim p_n(x) = \tfrac{1}{2}p_0(x)$. Now (4) exceeds (5) by half the amount it did in the corresponding above example.

The essential features of these examples could be obtained with normal distributions but would involve a little more computation, for instance, $N(-n, 1)$, $N(0, n^2)$, $N(1/n, 1/n^2)$, for examples (i), (ii), (iii), respectively.
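The behavior of (4) in examples (i), (ii) and (iii) can be made concrete by evaluating the distribution functions in closed form for a large $n$. In the Python sketch below the function names `cdf_i`, `cdf_ii` and `cdf_iii` are illustrative assumptions; each returns $\int_{-\infty}^{\xi} p_n(x)\,dx$ for the corresponding example.

```python
def clamp01(value):
    """Restrict a value to the interval [0, 1]."""
    return max(0.0, min(1.0, value))


def cdf_i(xi, n):
    """Example (i): p_n uniform on (-n - 1, -n)."""
    return clamp01(xi + n + 1.0)


def cdf_ii(xi, n):
    """Example (ii): p_n uniform on (-n/2, n/2)."""
    return clamp01(xi / n + 0.5)


def cdf_iii(xi, n):
    """Example (iii): p_n(x) = 2 n^2 x on (0, 1/n), whose integral up to xi is (n xi)^2."""
    if xi <= 0.0:
        return 0.0
    return clamp01((n * xi) ** 2)


if __name__ == "__main__":
    n = 10 ** 6
    for xi in (-3.0, 0.0, 0.01, 5.0):
        print(xi, cdf_i(xi, n), cdf_ii(xi, n), cdf_iii(xi, n))
```

In every case the pointwise limit of the densities is zero, so (5) vanishes, while the values returned here approach 1, $\tfrac{1}{2}$, and the step function described in (iii).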

We note that in none of these examples is $\lim p_n(x)$ a density. This suggests that the trouble might perhaps be prevented by requiring that $\lim p_n(x)$ be a density, which happens in the case from which we started. This surmise is correct. We may formalize the situation as follows.

Definition. A function $f(x)$ will be called a density if it is non-negative and $\int_R f(x)\,dx = 1$. Here $R$ denotes the whole space of $x$.

The reader may think of a univariate density, where $x$ is a real variable and $R$ is the real axis, but theorem and proof run the same for a $k$-variate density, where $x$ is a point in a $k$-dimensional Euclidean space $R$.

Theorem². If for a sequence $\{p_n(x)\}$ of densities

$\lim p_n(x) = p(x)$

² The hypotheses of this theorem, while perfectly adapted to applications in probability and statistics, would not seem the "natural" ones in real variable or measure theory. Professor A. P. Morse has remarked to the writer that, if the theorem has not been stated in this form before, it is at least an easy corollary of some more general results known in that field. Nevertheless our direct proof based only on the familiar Lebesgue theorem and using only
for almost all $x$ in $R$, then a sufficient condition that

$\lim \int_S p_n(x)\,dx = \int_S p(x)\,dx,$

uniformly for all Borel sets $S$ in $R$, is that $p(x)$ be a density.

Proof. Let us write the difference

(6) $p_n(x) - p(x) = \delta_n(x).$

Then

(7) $\delta_n(x) \to 0$

for almost all $x$ in $R$. Also

(8) $\int_S \delta_n\,dx = \int_S p_n\,dx - \int_S p\,dx,$

and so it suffices to prove that $\int_S \delta_n\,dx \to 0$ uniformly for all $S$ in $R$, where $S$ henceforth denotes a Borel set. If in (8) we let $S = R$ we get

(9) $\int_R \delta_n\,dx = 0$

since $p_n$ and $p$ are densities. We now split the difference into its positive and negative parts: Let

(10) $\delta_n^+ = \tfrac{1}{2}(\delta_n + |\delta_n|), \quad \delta_n^- = \tfrac{1}{2}(\delta_n - |\delta_n|),$

so that

$\delta_n = \delta_n^+ + \delta_n^-, \quad \delta_n^+ \ge 0, \quad \delta_n^- \le 0.$

From (7) and (10), we find

(11) $\delta_n^- \to 0$

for almost all $x$ in $R$, and from (9),

(12) $\int_R \delta_n^+\,dx + \int_R \delta_n^-\,dx = 0.$

very simple manipulations may be of interest to readers of the Annals. Professor Morse also pointed out that the stronger result $\lim \int_S |p_n(x) - p(x)|\,dx = 0$ uniformly for all $S$, may be stated. This follows from our proof since $|\delta_n| = \delta_n^+ - \delta_n^-$.
By virtue of (6), $\delta_n \ge -p$. Now if $\delta_n \le 0$, $\delta_n^- = \delta_n \ge -p$, and if $\delta_n > 0$, $\delta_n^- = 0 \ge -p$, and hence in every case $0 \ge \delta_n^- \ge -p$. Since we now have $|\delta_n^-(x)| \le p(x)$ and $\int_R p(x)\,dx = 1$, we may apply³ the Lebesgue theorem to get

$\lim \int_R \delta_n^-\,dx = \int_R \lim \delta_n^-\,dx.$

The right member is zero because of (11). It then follows from (12) that $\lim \int_R \delta_n^+\,dx$ is also zero. The relations

$0 \le \int_S \delta_n^+\,dx \le \int_R \delta_n^+\,dx \to 0,$

$0 \ge \int_S \delta_n^-\,dx \ge \int_R \delta_n^-\,dx \to 0$

guarantee that the quantities $\int_S \delta_n^+\,dx$ and $\int_S \delta_n^-\,dx$ have the limit zero uniformly for all $S$, and hence the same is true of their sum (8).
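For the $t$-density (2) the conclusion can be observed numerically. The Python sketch below approximates $\int |p_n(x) - p(x)|\,dx$ by a midpoint rule on a truncated interval and shows it decreasing as $n$ grows; it also evaluates $c_n$ at a large $n$. The truncation to $(-40, 40)$, the step count, and the function names are assumptions made for illustration, so the computed distances slightly understate the true ones.

```python
import math


def t_density(x, n):
    """Density (2) of the t-variable with n degrees of freedom, using c_n from (3)."""
    log_c = math.lgamma((n + 1) / 2) - math.lgamma(n / 2) - 0.5 * math.log(n * math.pi)
    return math.exp(log_c - 0.5 * (n + 1) * math.log1p(x * x / n))


def normal_density(x):
    """Density of N(0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def l1_distance(n, half_width=40.0, steps=80000):
    """Midpoint-rule approximation to the integral of |p_n - p| over (-half_width, half_width)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += abs(t_density(x, n) - normal_density(x))
    return total * h


if __name__ == "__main__":
    for n in (2, 10, 50):
        print(n, l1_distance(n))
    print(t_density(0.0, 10 ** 4), 1.0 / math.sqrt(2.0 * math.pi))
```

Since $\sup_S |\int_S \delta_n\,dx|$ equals half of $\int_R |\delta_n|\,dx$ when $p_n$ and $p$ are densities, the printed distances bound the error in (1) uniformly over Borel sets.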

Returning to the example (2), we remark that it is practically obvious that the second factor on the right has the limit $e^{-x^2/2}$ but it is not quite so obvious that $\lim c_n = (2\pi)^{-1/2}$. This situation is typical of many applications where it is more difficult to evaluate the limit of "the" constant than the limit of the remaining factors, and one wonders after obtaining the latter limit whether the constant is not automatically forced toward the limit desired for it, and whether the direct calculation of its limit could not be avoided. Let us put the question as follows: Suppose that

$\{p_n(x) = c_n f_n(x)\}$

is a sequence of densities and that

$p(x) = c f(x)$

is also a density. Then if $\lim f_n(x) = f(x)$ for almost all $x$, may we conclude that $\lim c_n = c$? If so, we could then apply the above theorem without having evaluated the limit of the constant or produced a dominating function. Unfortunately the answer to this question is no, as shown by example (iv) above:


³ Although our proof rests on the Lebesgue convergence theorem, this theorem is applied to $\delta_n^-(x)$ and not to $p_n(x)$. While in most cases of practical interest the sequence $\{p_n(x)\}$ is uniformly dominated by an integrable function, it is possible to devise a simple example where this is not true and yet our theorem applies. Let $p_n(x) = 1$ for $1/(n + 1) < x < 1$ and for $a_n < x < a_{n+1}$, zero elsewhere, where $a_n = \sum_{i=1}^{n} 1/i$. Then $\sup_n p_n(x) = 1$ for all $x > 0$, nevertheless $\lim p_n(x)$ is a density, namely that of the uniform distribution on (0, 1).
If we let $f_n(x) = h_n(x) + p_0(x)$, and $f(x) = p_0(x)$, then $\lim f_n(x) = f(x)$, but $c_n = \tfrac{1}{2}$ and $c = 1$, hence $\lim c_n \ne c$. Employing the assumption that $p_n(x)$ and $p(x)$ are densities we see

$1/c_n = \int_R f_n(x)\,dx, \quad 1/c = \int_R f(x)\,dx,$

and hence $\lim c_n = c$ if and only if

(13) $\lim \int_R f_n(x)\,dx = \int_R \lim f_n(x)\,dx.$

It follows that in such cases if we wish to establish a limiting distribution in the sense (1), we may either prove $\lim c_n = c$, or we may justify (13), say by producing a suitable dominating function, but we need not do both. No doubt the first alternative would be preferable at all but the most advanced levels of teaching or exposition.
AN EXPLICIT REPRESENTATION OF A STATIONARY
GAUSSIAN PROCESS

By M. Kac¹ and A. J. F. Siegert

Cornell University and Syracuse University

1. In a paper which will soon appear in the Journal of Applied Physics [1] the authors have introduced methods of calculating certain probability distributions which are of importance in the theory of random noise in radio receivers.

The complexity of the physical problem and occasional uses of heuristic reasoning may have obscured some of the mathematical points. For this reason the authors felt that it may be worth while to illustrate one of the basic ideas on a simple but important example.

2. A stationary Gaussian process is a one parameter family $x(t)$ of random variables such that:

(a) $x(t)$ is normally distributed, the mean and the variance being independent of $t$;

(b) the joint probability distribution of $x(t_1), x(t_2), \cdots, x(t_r)$ is multivariate Gaussian whose parameters depend only on the differences $t_j - t_k$.

¹ John Simon Guggenheim Memorial Fellow.
We assume, for the sake of simplicity, that the process is normalized, i.e.,

$E\{x(t)\} = 0, \quad E\{x^2(t)\} = 1$

and we define the correlation function $\rho(\tau)$ by the usual formula

$\rho(\tau) = E\{x(t)x(t + \tau)\}.$

It is then well known² that a distribution function $\sigma(u)$ exists such that for all $\tau$

(1) $\rho(\tau) = \int_{-\infty}^{\infty} \cos u\tau\ d\sigma(u).$

3. Let $0 \le s, t \le T$ and consider the symmetric kernel

$K(s, t) = \rho(s - t).$

The fact that $\sigma(u)$ is non-decreasing implies that the kernel $\rho(s - t)$ is quasi-definite, i.e., for every function $g(t)$ on $(0, T)$ one has

$\int_0^T \int_0^T g(s)\rho(s - t)g(t)\,ds\,dt \ge 0.$

Thus the eigenvalues of the integral equation

(2) $\int_0^T \rho(s - t)f(t)\,dt = \lambda f(s)$

are non-negative. Moreover, denoting by $\lambda_j$ the eigenvalues and by $f_j(t)$ the corresponding normalized eigenfunctions of (2) we have by the classical theorem of Mercer (see [4], in particular part 6 of Ch. I) that

(3) $\rho(s - t) = \sum_j \lambda_j f_j(s) f_j(t)$

where the series on the right is absolutely and uniformly convergent. It should be noted that in virtue of (1) $\rho(\tau)$ is a continuous function.

4. Let now $G_1, G_2, G_3, \cdots$ be independent, normally distributed random variables each having mean 0 and variance 1.

Consider the series

(4) $\sum_j \sqrt{\lambda_j}\,G_j f_j(t).$

Since for each $t$ we have

$\sum_j E\{\lambda_j G_j^2 f_j^2(t)\} = \sum_j \lambda_j f_j^2(t) = \rho(0) = 1,$

we infer that for each $t$ the series (4) converges in the mean to a random variable $x(t)$. Moreover, by a theorem of Kolmogoroff [5], the series (4) converges, for each $t$, to $x(t)$ with probability 1.

² See [2]. The theorem in question (in a somewhat different form) seems to have been first established by N. Wiener in [3].
Thus we may write

(5) $x(t) = \sum_j \sqrt{\lambda_j}\,G_j f_j(t).$

It is now easy to show that $x(t)$ thus defined is a stationary Gaussian process in $(0, T)$ with the correlation function $\rho(\tau)$.

In fact,

$E\{x(s)x(t)\} = \sum_j \lambda_j f_j(s) f_j(t) = \rho(s - t), \quad 0 \le s, t \le T,$

and conditions (a) and (b) of section 2 follow from the well known properties of linear combinations of independent Gaussian random variables. Of course, we are dealing here with infinite linear combinations but the mean convergence, noted above, is sufficient to justify the extension to our case.
5. It is more illuminating to think of the random variables $G_j$ as measurable functions $G_j(\omega)$ defined on an abstract set $\Omega$ in which a Lebesgue measure has been established (the measure of the whole space being 1).

The representation (5) can then be written in the equivalent form

(6) $x(t, \omega) = \sum_j \sqrt{\lambda_j}\,G_j(\omega) f_j(t).$

The equality, as established in section 4, holds for every $t$ in the sense of mean convergence. Moreover, by the theorem of Kolmogoroff cited above, and by Fubini's theorem the equality (6) holds for almost every pair $(t, \omega)$, $(0 \le t \le T)$, in the sense of ordinary convergence.

Furthermore by Mercer's theorem (remember that $\lambda_j \ge 0$)

$\sum_j \lambda_j = \int_0^T \rho(s - s)\,ds = T$

and hence

$\sum_j E\{\lambda_j G_j^2\} = \sum_j \lambda_j = T < \infty.$

Thus

$\sum_j \lambda_j G_j^2(\omega)$

converges for almost every $\omega$ and therefore the series

(7) $\sum_j \sqrt{\lambda_j}\,G_j(\omega) f_j(t)$

converges in the mean for almost every $\omega$.

Combining this fact with the observation that (7) converges almost everywhere to $x(t, \omega)$ we see that, for almost every $\omega$, the series (7) converges in the mean to $x(t, \omega)$ and that consequently
(8) $\int_0^T x^2(t, \omega)\,dt = \sum_j \lambda_j G_j^2(\omega)$

for almost every $\omega$.

It should be noted that (8) could not, in general, be derived by just appealing to Parseval's relation. The main reason is that Parseval's relation holds only for complete orthonormal systems whereas the orthonormal system $\{f_j(t)\}$ of eigenfunctions may fail to be complete. If the kernel $\rho(s - t)$ is positive-definite (in which case all the eigenvalues are positive instead of just non-negative) then it is known that the eigenfunctions form a complete set. This actually happens to be the case in most physical applications.

6. An important application of (8) is the calculation of the characteristic function of the distribution function of the random variable

(9) $I = \int_0^T x^2(t, \omega)\,dt.$

In fact,

(10) $E\{\exp(i\xi I)\} = \prod_j E\{\exp(i\xi\lambda_j G_j^2)\} = \prod_j (1 - 2i\xi\lambda_j)^{-1/2}.$

The probability density of $I$ is the Fourier integral

$\dfrac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\xi I} \prod_j (1 - 2i\xi\lambda_j)^{-1/2}\,d\xi$

which, unfortunately, in most cases cannot be calculated explicitly. If

$\rho(\tau) = e^{-\beta|\tau|}$

in which case the process is also Markoffian, the eigenvalues $\lambda_j$ can be calculated explicitly³ but in more complicated cases it is quite difficult to determine them.
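Formula (10) can be checked by simulation for a finite set of eigenvalues. The Python sketch below compares the product $\prod_j (1 - 2i\xi\lambda_j)^{-1/2}$ with an empirical average of $\exp(i\xi I)$, where $I = \sum_j \lambda_j G_j^2$ as in (8). The three eigenvalues used are hypothetical (chosen only so that they sum to $T = 1$) and the function names are illustrative assumptions.

```python
import cmath
import random


def char_fn(xi, lams):
    """Product formula (10); each factor has positive real part, so the
    principal branch of the square root is the appropriate one."""
    value = 1.0 + 0.0j
    for lam in lams:
        value *= (1 - 2j * xi * lam) ** -0.5
    return value


def empirical_char_fn(xi, lams, draws=100000, seed=0):
    """Monte Carlo estimate of E exp(i*xi*I) with I = sum of lam_j * G_j^2."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(draws):
        energy = sum(lam * rng.gauss(0.0, 1.0) ** 2 for lam in lams)
        total += cmath.exp(1j * xi * energy)
    return total / draws


if __name__ == "__main__":
    lams = [0.5, 0.3, 0.2]  # hypothetical eigenvalues with sum T = 1
    print(char_fn(0.7, lams))
    print(empirical_char_fn(0.7, lams))
```

With $10^5$ draws the two printed values should agree to roughly two decimal places; a closer agreement would require more draws.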

7. If $\rho(\tau)$ is absolutely integrable and $\sigma(u)$ absolutely continuous then, setting

$A(u) = \sigma'(u),$

we have $A(u) \ge 0$ and

$\rho(\tau) = \int_{-\infty}^{\infty} \cos u\tau\ A(u)\,du = \int_{-\infty}^{\infty} e^{iu\tau} B(u)\,du, \quad B(u) = \dfrac{A(u) + A(-u)}{2}.$


³ See [6], in particular section 4. We take this opportunity to correct two misprints in this note. In the last formula on p. 64 $M$ should be replaced by $N$. Also the limits of integration in formula (6) should be $0, s$ and $s, p + q$ instead of $0, p + q$ and $0, p + q$. The N.D.R.C. Report 14-305 to which a reference is made has been declassified in the meantime. It contains results which originated both [1] and the present note.

⁴ These and related results were stated in the abstract [7] by M. Kac. The paper is now being prepared for publication.
It can then be shown⁴ that

$\lim_{T \to \infty} \dfrac{1}{T}\sum_j \lambda_j^2 = 2\pi\int_{-\infty}^{\infty} B^2(u)\,du = \int_{-\infty}^{\infty} \rho^2(\tau)\,d\tau$

and 

hm ^ L [ B^(u) diL. 

X } J— ^ 

It follows now by standard methods that the characteristic function of 



approaches, as $T \to \infty$,



where 

i/ = f p{,t) dr. 

Thus, as $T \to \infty$, the distribution of (11) becomes normal with mean 0 and
variance 
APPROXIMATE FORMULAS FOR THE RADII OF CIRCLES
WHICH INCLUDE A SPECIFIED FRACTION OF A
NORMAL BIVARIATE DISTRIBUTION

By E. N. Oberg

University of Iowa

1. Introduction. Consider the normal bivariate error distribution

(1) $\varphi(x, y) = \dfrac{1}{2\pi\sigma_x\sigma_y}\exp\left[-\dfrac{1}{2}\left(\dfrac{x^2}{\sigma_x^2} + \dfrac{y^2}{\sigma_y^2}\right)\right].$

The purpose of this paper is to present certain approximate formulas for the radii of circles whose centers are at the origin, which include a prescribed proportion, $p$, of errors. The formulas are, for given $\sigma_x$, $\sigma_y$, and $p$,

(2) $R_1 = \sqrt{2\sigma_x\sigma_y \ln(1/[1 - p])}$

(3) $R_2 = \sqrt{(\sigma_x^2 + \sigma_y^2) \ln(1/[1 - p])}$

and

(4) $R_3 = (\sigma_x + \sigma_y)\sqrt{(1/2)\ln(1/[1 - p])}.$

In section 3 we present tables of $p'$, the true proportion of errors contained in circles whose radii are given by the above formulas. These tables reflect the goodness of approximation of each formula to the true radius, $R$, for $0.1 \le p \le 0.9$ and $0.5 \le \sigma_x/\sigma_y \le 0.9$. Also, a brief statement is included for the same range of $p$ but with $0.1 \le \sigma_x/\sigma_y \le 0.4$.

2. The derivation of the formulas. The proportion $p$ of errors that fall within an area $A$ on the $xy$-plane is given by

(5) $p = \iint_A \varphi(x, y)\,dA.$

If the area is bounded by any member of the family of ellipses

$x^2/\sigma_x^2 + y^2/\sigma_y^2 = \lambda^2,$

the above integral may be evaluated directly. The result is

$p = 1 - e^{-\lambda^2/2}$

whence

$\lambda^2 = 2\ln(1/[1 - p]).$

Thus the ellipse with semi-axes

(6) $\sigma_x\sqrt{2\ln(1/[1 - p])}, \quad \sigma_y\sqrt{2\ln(1/[1 - p])},$

measured from the origin along the $x$ and $y$ axes respectively, will include exactly the prescribed proportion of errors.

Frequently, however, it is desired to know which circles rather than which ellipses include a certain proportion of the errors. In this case it becomes difficult to obtain a formula for the true radius from (5) unless $\sigma_x = \sigma_y$, in which case $R$ is given by either one of the formulas in (6). However, a natural approximation to make is to equate the area of a circle of radius, say $R_1$, to the area of the ellipse whose semi-axes are given in (6). This gives formula (2),

$R_1 = \sqrt{2\sigma_x\sigma_y \ln(1/[1 - p])},$

which can be expected to give a fairly close approximation to true $R$ if $\sigma_x$ is close to $\sigma_y$. If $\sigma_x \ne \sigma_y$, it has been shown that this formula underestimates true $R$, which is undesirable in some applications [1]. That is, if $R_1$ is used to estimate, say, the radius of a circle to include 50% of the errors ($p = .5$), it will give a value which includes less than the desired proportion. The first table in the last section gives a numerical verification of this fact.
To obtain formula (3) we consider formula (5) when $A$ is a circle of radius $R$. We have

$p = 4\int_0^R \int_0^{\sqrt{R^2 - x^2}} \varphi(x, y)\,dy\,dx.$

By making the transformation $x = \sigma_x r\cos\theta$, $y = \sigma_y r\sin\theta$, and by carrying out the integration with respect to $r$ the above formula becomes

$p = 1 - (2/\pi)\int_0^{\pi/2} \exp\left\{-R^2/\left[(\sigma_x^2 + \sigma_y^2) - (\sigma_y^2 - \sigma_x^2)\cos 2\theta\right]\right\}\,d\theta.$

We let

$\alpha = R^2/(\sigma_x^2 + \sigma_y^2), \quad \beta = (\sigma_y^2 - \sigma_x^2)/(\sigma_y^2 + \sigma_x^2),$

and

$\epsilon = \sigma_x/\sigma_y, \quad \sigma_x \le \sigma_y.$

Then $\alpha = R^2/[\sigma_y^2(1 + \epsilon^2)]$ and $\beta = (1 - \epsilon^2)/(1 + \epsilon^2)$, which is less than unity. This substitution will be helpful later in preparing tables. The fact that $\sigma_x$ is taken less than $\sigma_y$ places no limitation on the final results since we only have to interchange axes in the other case. The above integral may now be written as

(7) $p = 1 - (2/\pi)\int_0^{\pi/2} e^{-\alpha/(1 - \beta\cos 2\theta)}\,d\theta = 1 - (2/\pi)e^{-\alpha}\int_0^{\pi/2} e^{-\alpha\beta\cos 2\theta/(1 - \beta\cos 2\theta)}\,d\theta.$

The integrand, say $F(\theta)$, in the last integral of (7) can be shown to be monotone increasing from $e^{-\alpha\beta/(1 - \beta)}$ to $e^{\alpha\beta/(1 + \beta)}$ as $\theta$ varies from 0 to $\pi/2$. Furthermore, it crosses the line $F(\theta) = 1$ somewhere in this interval and differs but little from it anywhere if the ratio $\sigma_x/\sigma_y$ is close to 1, since $\beta$ is then close to zero. If, therefore, we replace the integrand by $F(\theta) = 1$, we have $p = 1 - e^{-\alpha}$. Hence, if $\alpha$ is replaced by $R^2/(\sigma_x^2 + \sigma_y^2)$ and the result solved for $R$, we have formula (3),

$R_2 = \sqrt{(\sigma_x^2 + \sigma_y^2)\ln(1/[1 - p])}.$

Finally, formula (4),

$R_3 = (\sigma_x + \sigma_y)\sqrt{(1/2)\ln(1/[1 - p])},$

is obtained by taking the root-mean-square of the former two. This formula has certain advantages over the other two, the most obvious being that $\sigma_x$ and $\sigma_y$ enter linearly so that it is simple to evaluate for given $\sigma_x$, $\sigma_y$, and $p$. Secondly it will be seen by the tables and additional comments made in the last section that when $p = 0.5$,² $R_3$ overestimates true $R$ by a slight amount for all values of $\sigma_x/\sigma_y$, and it gives a fairly close approximation to true $R$ for all $p$ when $\sigma_x/\sigma_y \ge 0.5$.

² This particular value of $p$ gives the circular probable error. In this case $R_3 = 0.5887(\sigma_x + \sigma_y)$.

We close this section by making a few brief comments. In the first place, if any of the above formulas is to be computed from a sample of data, we take $\sqrt{\sum x^2/(n - 1)}$ and $\sqrt{\sum y^2/(n - 1)}$ as estimates of $\sigma_x$ and $\sigma_y$ respectively. Furthermore, we test the significance of these statistics by known formulas [2].

Finally, $\sigma_x$ and $\sigma_y$ may be replaced by $\sqrt{\pi/2}\,D_x$ and $\sqrt{\pi/2}\,D_y$, where $D$ denotes the population mean deviation. Thus, for example,

$R_3 = (D_x + D_y)\sqrt{(\pi/4)\ln(1/[1 - p])}.$
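The three formulas and the integral (7) are simple to evaluate numerically. The Python sketch below computes $R_1$, $R_2$, $R_3$ and then the true proportion $p'$ inside a circle of given radius by a midpoint rule applied to the first form of (7). The function names and the number of integration steps are illustrative assumptions.

```python
import math


def log_term(p):
    """ln(1 / [1 - p]), the factor common to formulas (2), (3) and (4)."""
    return math.log(1.0 / (1.0 - p))


def r1(sx, sy, p):
    """Formula (2): equal-area approximation."""
    return math.sqrt(2.0 * sx * sy * log_term(p))


def r2(sx, sy, p):
    """Formula (3): integrand F(theta) replaced by 1."""
    return math.sqrt((sx ** 2 + sy ** 2) * log_term(p))


def r3(sx, sy, p):
    """Formula (4): root-mean-square of (2) and (3)."""
    return (sx + sy) * math.sqrt(0.5 * log_term(p))


def coverage(radius, sx, sy, steps=2000):
    """True proportion p' of errors inside the circle, from the first form of (7)."""
    alpha = radius ** 2 / (sx ** 2 + sy ** 2)
    beta = (sy ** 2 - sx ** 2) / (sy ** 2 + sx ** 2)
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.exp(-alpha / (1.0 - beta * math.cos(2.0 * theta)))
    return 1.0 - (2.0 / math.pi) * total * h


if __name__ == "__main__":
    sx, sy, p = 0.7, 1.0, 0.5
    for name, formula in (("R1", r1), ("R2", r2), ("R3", r3)):
        radius = formula(sx, sy, p)
        print(name, radius, coverage(radius, sx, sy))
```

When $\sigma_x = \sigma_y$ the three radii coincide and `coverage` returns $p$ itself, which gives a quick check of the integration.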

3. Tables. The first formula in (7) is useful in testing by means of numerical integration the goodness of approximation of the formulas $R_1$, $R_2$, and $R_3$ to the true value of $R$. We construct the tables by replacing $R$ in $\alpha$ by one of these formulas, say formula $R_1$. This gives $\alpha = [2\epsilon/(1 + \epsilon^2)]\ln[1/(1 - p)]$. Since $\beta = (1 - \epsilon^2)/(1 + \epsilon^2)$, the right hand side of the formula in (7) may then be evaluated for a choice of $\epsilon$ and $p$ giving a value we denote by $p'$. This is the actual proportion of errors that is included in the circle whose radius is $R_1$. If $R_1$ gave true $R$, then $p'$ would be equal to $p$, so we may regard the difference of $p$ and $p'$ as a measure of the error arising when $R_1$ is used to estimate $R$.

In the following tables the chosen values of $p$ and $\epsilon = \sigma_x/\sigma_y$ are listed in the first row and column respectively. The remainder of the tables include the corresponding values of $p'$.

We also have computed tables for $0.1 \le \sigma_x/\sigma_y \le 0.4$ which we have not included in this paper since for this range of values of $\sigma_x/\sigma_y$, all of the formulas give approximations that depart considerably from true $R$ except $R_3$ when $p = 0.5$. For this case, $p' = .4776$, $.5004$, $.5109$, and $.5120$ when $\sigma_x/\sigma_y = 0.1$, $0.2$, $0.3$, and $0.4$ respectively.

TABLE I

p' computed by means of formula R_1

 σx/σy \ p    .1     .2     .25    .3     .4     .5     .6     .7     .75    .8     .9
    .5      .0988  .1951  .2425  .2893  .3815  .4720  .5615  .6508  .6960  .7422  .8408
    .6      .0994  .1974  .2459  .2942  .3899  .4846  .5786  .6726  .7198  .7676  .8668
    .7      .0997  .1987  .2480  .2972  .3950  .4924  .5894  .6864  .7350  .7838  .8835
    .8      .0999  .1995  .2492  .2989  .3981  .4970  .5958  .6946  .7440  .7936  .8935
    .9      .1000  .1999  .2498  .2997  .3996  .4993  .5991  .6988  .7483  .7986  .8985
   1.0      .1000  .2000  .2500  .3000  .4000  .5000  .6000  .7000  .7500  .8000  .9000
The difference between an entry in a column and the corresponding value of $p$ at the head of the column reflects the error in estimating true $R$ by means of $R_1$, $R_2$, and $R_3$. For example, if $p$ is chosen as .5 and $\sigma_x/\sigma_y = .7$ then $R_3$ gives the radius of a circle which includes 50.31% of the errors. Thus $R_3$ overestimates true $R$ by including .31% more of the errors.

TABLE II

p' computed by means of formula R_2

 σx/σy \ p    .1     .2     .25    .3     .4     .5     .6     .7     .75    .8     .9
    .5      .1215  .2363  .2912  .3446  .4467  .5432  .6340  .7217  .7641  .8060  .8907
    .6      .1116  .2202  .2732  .3255  .4274  .5201  .6218  .7146  .7600  .8050  .8949
    .7      .1057  .2100  .2616  .3127  .4140  .5136  .6116  .7081  .7558  .8032  .8976
    .8      .1022  .2039  .2546  .3051  .4050  .5055  .6048  .7034  .7525  .8014  .8991
    .9      .1005  .2009  .2509  .3012  .4013  .5012  .6011  .7008  .7506  .8003  .8999
   1.0      .1000  .2000  .2500  .3000  .4000  .5000  .6000  .7000  .7500  .8000  .9000

TABLE III

p' computed by means of formula R_3

(Entries shown as .... are illegible in the source.)

 σx/σy \ p    .1     .2     .25    .3     .4     .5     .6     .7     .75    .8     .9
    .5      ....   .2161  .2674  .3176  .4152  .5092  ....   .6887  .7327  .7768  ....
    .6      ....   ....   .2597  ....   ....   .5059  ....   ....   ....   .7872  .8817
    .7      ....   ....   .2548  ....   .4046  .5031  ....   ....   .7456  .7937  ....
    .8      .1011  ....   ....   .3020  .4018  ....   ....   ....   .7483  .7976  .8963
    .9      ....   ....   ....   ....   ....   ....   ....   .6998  ....   .7995  .8992
   1.0      ....   ....   ....   ....   ....   ....   ....   ....   ....   ....   ....

By examining the tables it is seen that when $0.1 \le p \le 0.3$, $R_1$ gives the best approximation to the true value of $R$, while $R_2$ gives the poorest. If $0.4 \le p \le 0.75$, $R_3$ gives the best and $R_2$ the poorest; and if $0.8 \le p \le 0.9$, $R_2$ gives the best and $R_1$ the poorest. Thus formula $R_3$ for general use gives the best overall approximation. It may be remarked at this point that bounds for the true value of $R$ can be found by applying two of the formulas, one of which overestimates while the other underestimates $R$. From the tables it is apparent that this can be done for values of $p \le 0.8$.

Finally, these formulas may be used to test roughly the normality of the data. For example, if proper estimates³ of $\sigma_x$ and $\sigma_y$ are made from the data, and the corresponding value of $R_3$ computed for a chosen $p$, then approximately, the proportion $p'$ of plotted errors should fall within the circle of radius $R_3$.

³ See section 2.

REFERENCES

[1] Henry Scheffé, Armor and Ordnance Report No A-SSJf, OSRD No. 1918, Div. 2,
A NOTE ON THE EFFICIENCY OF THE WALD SEQUENTIAL TEST

By Edward Paulson

Institute of Statistics, University of North Carolina

The sequential likelihood ratio test of Wald for testing the hypothesis $H_0$ that the probability density function is $f(X, \theta_0)$ against the one-sided alternative $H_1$ that the function is $f(X, \theta_1)$ has been shown [1] to have the optimum property of minimizing the expected number of observations at the two points $\theta = \theta_0$ and $\theta = \theta_1$. Tables showing the actual magnitude of the percentage saving of this sequential procedure compared with the classical "best" non-sequential test have been calculated (see [1], page 147) for the normal case when

$f(X, \theta) = \dfrac{1}{\sqrt{2\pi}}\exp\left[-\dfrac{(X - \theta)^2}{2}\right].$

In this note we will show that when $\theta_1$ is close to $\theta_0$, the percentage saving is independent of the particular function $f(X, \theta)$ and the particular values $\theta_1$ and $\theta_0$, so that the tables mentioned above can be used to show the percentage saving for any one-sided sequential test involving a single parameter, provided $f(X, \theta)$ satisfies some weak restrictions.

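Let $f(X, \theta)$ be the probability density function of a random variable $X$. Let $E_i(n)$ denote the expected value (when $\theta = \theta_i$) of the number of independent observations required by the Wald sequential procedure to test the hypothesis $H_0$ that $\theta = \theta_0$ against $\theta = \theta_1 = \theta_0 + \Delta$ with probabilities $\alpha$ of rejecting $H_0$ when $\theta = \theta_0$ and $\beta$ of accepting $H_0$ when $\theta = \theta_1$. Let $N$ be the number of independent observations required to achieve the same probabilities $\alpha$ and $\beta$ for testing the hypothesis $\theta = \theta_0$ against $\theta = \theta_1$ by the most powerful non-sequential test. Let $U_\alpha$ and $U_\beta$ be defined by the relations

$\dfrac{1}{\sqrt{2\pi}}\int_{U_\alpha}^{\infty} e^{-t^2/2}\,dt = \alpha$

and

$\dfrac{1}{\sqrt{2\pi}}\int_{U_\beta}^{\infty} e^{-t^2/2}\,dt = \beta.$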
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Wc Will prove the following theorem: 

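$$\lim_{\Delta \to 0} \frac{E_0(n)}{N} = -2\, \frac{\alpha \log\left(\frac{1-\beta}{\alpha}\right) + (1-\alpha) \log\left(\frac{\beta}{1-\alpha}\right)}{(U_\alpha + U_\beta)^2}$$

provided $f(X, \theta)$ satisfies the following conditions:

(A) $\int_{-\infty}^{\infty} f(X, \theta)\, dx$ can be differentiated twice under the
integral sign with respect to $\theta$.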

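(B) All four of the integrals

$$\int_{-\infty}^{\infty} \left[\frac{f'(x, \theta^*)}{f(x, \theta^*)}\right]^2 f(x, \theta_0)\, dx, \qquad \int_{-\infty}^{\infty} \left[\frac{f''(x, \theta^*)}{f(x, \theta^*)}\right] f(x, \theta_0)\, dx,$$

$$\int_{-\infty}^{\infty} \frac{f'(x, \theta_0)}{f(x, \theta_0)}\, f'(x, \theta^*)\, dx, \qquad \int_{-\infty}^{\infty} \left[\frac{f'(x, \theta_0)}{f(x, \theta_0)}\right]^2 f(x, \theta^*)\, dx,$$
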

are continuous functions of $\theta^*$ at $\theta^* = \theta_0$. A sufficient
condition for (B) is that all the integrals be uniformly convergent with respect to
$\theta^*$ in some interval $\theta_0 < \theta^* < \theta_0 + \Delta$, and all the
integrands be continuous functions of X and $\theta^*$. A similar theorem holds
regarding the limit of $\{E_1(n)/N\}$ as $\Delta \to 0$.
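
As a numerical illustration of the limit stated in the theorem, the following minimal Python sketch evaluates the limiting ratio for given error probabilities. It assumes that $U_\alpha$ and $U_\beta$ are the upper $\alpha$ and $\beta$ points of the standard normal distribution; the function name `limiting_ratio` is hypothetical and not part of the note.

```python
import math
from statistics import NormalDist


def limiting_ratio(alpha: float, beta: float) -> float:
    """Limiting value of E_0(n)/N as Delta -> 0, as given by the theorem.

    Assumption: U_alpha and U_beta are upper-tail standard normal quantiles.
    """
    std_normal = NormalDist()
    u_alpha = std_normal.inv_cdf(1.0 - alpha)
    u_beta = std_normal.inv_cdf(1.0 - beta)
    numerator = (alpha * math.log((1.0 - beta) / alpha)
                 + (1.0 - alpha) * math.log(beta / (1.0 - alpha)))
    return -2.0 * numerator / (u_alpha + u_beta) ** 2


ratio = limiting_ratio(0.05, 0.05)
print(f"limiting E_0(n)/N for alpha = beta = 0.05: {ratio:.4f}")
```

For $\alpha = \beta = 0.05$ the ratio comes out at roughly 0.49, that is, under $H_0$ the sequential test needs on the average about half the observations of the fixed-sample test in the limit.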

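The proof is as follows: From [1], we know that

$$E_0(n) = \frac{\alpha \log\left(\frac{1-\beta}{\alpha}\right) + (1-\alpha) \log\left(\frac{\beta}{1-\alpha}\right) + o(1)}{E_0(z)},$$

where

$$z = \log\left[\frac{f(x, \theta_1)}{f(x, \theta_0)}\right]$$

and $o(1) \to 0$ as $\Delta \to 0$.
Now

$$E_0(z) = \int_{-\infty}^{\infty} [\log f(x, \theta_0 + \Delta)]\, f(x, \theta_0)\, dx - \int_{-\infty}^{\infty} [\log f(x, \theta_0)]\, f(x, \theta_0)\, dx.$$

Expanding $\log f(x, \theta_0 + \Delta)$ in a Taylor series about $\Delta = 0$, we have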


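$$\log f(x, \theta_0 + \Delta) = \log f(x, \theta_0) + \Delta\, \frac{f'(x, \theta_0)}{f(x, \theta_0)} + \frac{\Delta^2}{2} \left[\frac{f''}{f} - \frac{f'^2}{f^2}\right]_{\theta = \theta_0} + \frac{\Delta^2}{2}\, R,$$

where

$$\theta_0 < \theta^* < \theta_0 + \Delta, \qquad f' = \frac{\partial f}{\partial \theta}, \qquad f'' = \frac{\partial^2 f}{\partial \theta^2},$$

$$R = \left[\frac{f''}{f} - \frac{f'^2}{f^2}\right]_{\theta = \theta^*} - \left[\frac{f''}{f} - \frac{f'^2}{f^2}\right]_{\theta = \theta_0}.$$

From assumption (A) we find that

$$\int_{-\infty}^{\infty} f'(x, \theta_0)\, dx = 0 \quad \text{and} \quad \int_{-\infty}^{\infty} f''(x, \theta_0)\, dx = 0,$$

while from assumption (B)

$$\int_{-\infty}^{\infty} R\, f(x, \theta_0)\, dx \to 0 \quad \text{as } \Delta \to 0.$$

Therefore

$$E_0(z) = -\frac{\Delta^2}{2} \left[\int_{-\infty}^{\infty} \left(\frac{f'(x, \theta_0)}{f(x, \theta_0)}\right)^2 f(x, \theta_0)\, dx + o(1)\right].$$
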
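To find N for the most powerful non-sequential test, we make use of the fact
(see [2]) that an asymptotically most powerful test for one-sided alternatives is
given by a region of the type

$$U_N = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \frac{f'(x_i, \theta_0)}{f(x_i, \theta_0)} \ge K.$$

When $\Delta \to 0$, $N \to \infty$, and since $U_N$ is the sum of N independent
variates with a finite second moment, the distribution of
$[U_N - E(U_N)]/\sigma_{U_N}$ approaches that of a normal variate with zero mean and
unit variance. Hence we find the N required for a test with Type I and Type II
errors $\alpha$ and $\beta$ by solving for N from the relations

$$\text{(1)} \qquad \frac{K}{\sqrt{E_0\left[\left(\frac{f'(x, \theta_0)}{f(x, \theta_0)}\right)^2\right]}} = U_\alpha,$$

$$\text{(2)} \qquad \frac{K - \sqrt{N}\, E_1\left[\frac{f'(x, \theta_0)}{f(x, \theta_0)}\right]}{\sqrt{E_1\left[\left(\frac{f'(x, \theta_0)}{f(x, \theta_0)}\right)^2\right] - \left\{E_1\left[\frac{f'(x, \theta_0)}{f(x, \theta_0)}\right]\right\}^2}} = -U_\beta.$$

Now let $y = f'(x, \theta_0)/f(x, \theta_0)$.
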
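Now $E_0(y) = \int_{-\infty}^{\infty} f'(x, \theta_0)\, dx = 0$, and we find from (1)
and (2) that

$$N = \left[\frac{U_\alpha \sqrt{E_0(y^2)} + U_\beta \sqrt{E_1(y^2) - [E_1(y)]^2}}{E_1(y)}\right]^2.$$

Since $f(x, \theta_1) = f(x, \theta_0) + \Delta f'(x, \theta^*)$ and $E_0(y) = 0$,

$$E_1(y) = \int_{-\infty}^{\infty} \frac{f'(x, \theta_0)}{f(x, \theta_0)}\, f(x, \theta_1)\, dx = \Delta \int_{-\infty}^{\infty} \frac{f'(x, \theta_0)}{f(x, \theta_0)}\, f'(x, \theta^*)\, dx = \Delta E_0(y^2)[1 + o(1)]$$

from assumption (B).

Proceeding in a similar manner, we find

$$\left[U_\alpha \sqrt{E_0(y^2)} + U_\beta \sqrt{E_1(y^2) - [E_1(y)]^2}\right]^2 = E_0(y^2)\,[U_\alpha + U_\beta(1 + o(1))]^2.$$

We now have

$$\frac{E_0(n)}{N} = E_0(n)\, \frac{\Delta^2 [E_0(y^2)]^2 (1 + o(1))^2}{E_0(y^2)\,[U_\alpha + U_\beta(1 + o(1))]^2};$$

therefore

$$\lim_{\Delta \to 0} \frac{E_0(n)}{N} = \lim_{\Delta \to 0} \frac{\alpha \log\left(\frac{1-\beta}{\alpha}\right) + (1-\alpha) \log\left(\frac{\beta}{1-\alpha}\right) + o(1)}{-\frac{\Delta^2}{2}\,[E_0(y^2) + o(1)]} \cdot \frac{\Delta^2 E_0(y^2)(1 + o(1))^2}{[U_\alpha + U_\beta(1 + o(1))]^2}$$

$$= -2\, \frac{\alpha \log\left(\frac{1-\beta}{\alpha}\right) + (1-\alpha) \log\left(\frac{\beta}{1-\alpha}\right)}{(U_\alpha + U_\beta)^2}.$$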
A NOTE ON THE POISSON-CHARLIER FUNCTIONS¹

By C. Truesdell
Naval Ordnance Laboratory
The polynomials $p_n(m, z)$ given by the definition

(1) $p_n(m, z) =$

called the Poisson-Charlier polynomials, and the associated function $\psi_n(m, z)$
given by the definition

(2) $\psi_n(m, z) = p_n(m, z)\, \psi_0(m, z)$,

(3) $\psi_0(m, z) = \dfrac{e^{-z} z^m}{m!}$,

occur in statistics. Doetsch [1] has devoted a memoir to them, and they are
noticed in Szegő's Orthogonal Polynomials (pp. 33-34).

¹ This note was written while the author was employed by the Radiation Laboratory,
MIT.

I suggest that they are most directly and easily studied in connection with
the “F-equation”

(4) $\dfrac{\partial}{\partial z} F(z, \alpha) = F(z, \alpha + 1)$,

whose properties and application to various special functions I have summarized
in a recent note [2]. Using the theorems of that note, which I shall
cite by number, I shall now generalize the Poisson-Charlier polynomials and
sketch the speediest derivation of their most interesting formal properties.

Greek letters shall represent unrestricted real numbers, while Latin letters
shall represent integers.

From the existence theorem for the F-equation (Theorem 4) we know that
there exists an integral function of z, $F_\beta(z, \alpha)$, which satisfies the
F-equation and the condition

(5) F/s(0, a) = cos(a -|- fi)it • 

From the uniqueness theorem for the F-equation (Theorem 4) it follows that

(6) $F_\beta(z, n - \beta + \tfrac{1}{2}) = 0$,

(7) = 0, n > 0. 

From the general power series solution for the F-equation (Theorem 4) we have
the formula

(8) Ppiz, a) = cos (a -1- ^)ir /3 + “ + 2)- 

We now define the Poisson-Charlier functions in general by the formulas

(9) $p_\beta(\alpha, z) = \Gamma(\alpha + 1)\, z^{-\alpha} F_\beta(z, -\alpha)$,

(10) $\psi_\beta(\alpha, z) = \dfrac{e^{-z} z^{\alpha}}{\Gamma(\alpha + 1)}\, p_\beta(\alpha, z)$.

From the formulas (6) and (7) we see that [1, p. 263]

(11) $\psi_\beta(-n, z) = 0$, $n > 0$;
(12) $\psi_\beta(-n + \beta - \tfrac{1}{2}, z) = 0$, $p_\beta(-n + \beta - \tfrac{1}{2}, z) = 0$.

From the formula (8) we see that 

(13) p^{a, z) = cos (js - a)ir ^ 2 ““iFj( — a; /? — a + 1; 2 ), 

whence it follows at once that 

(14) p^(m, z) = cos /Itt ^ /cK-e)"". 

This is the usual explicit expression for the Charlier polynomials [1, p. 257]. 
From formula (13) we see that 

(15) Vai.-oL, z) = - a)z“y{ct, z). 

In the indeterminate case when <x is a negative integer we see from the formula 
(14) that 

(16) Po(wi, 2 ) = 1, m ^ 0. 

Hence 

(17) <^o(-a, 2 ) = B~'y{a, z), 

(18) Mm, a) = . 

From the definition (10) we now see that

(19) $\psi_\beta(m, z) = p_\beta(m, z)\, \psi_0(m, z)$,

a generalization of the formula (2). From the formula (13) and the definition
(10) we see that

(20) 4'a(fi,z) = cos (^ - i)r(a ~ ^ + • 

Then by Kummer’s first transformation, 

(21) M^,^) = cos (/3 - a)ir + 1) + i,^-<3 + i;-gh 

from which it follows from the power series formula for solutions of the
F-equation (Theorem 4) that $\psi_\alpha(\beta, z)$ is a solution of the F-equation (4).

We now have two different solutions of the F-equation based on the
Poisson-Charlier functions:

(A) $F(z, \alpha) = e^{z}\, \psi_\beta(-\alpha, z)$.

(B) $F(z, \alpha) = \psi_\alpha(\beta, z)$.

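From the F-equation it is evident that

(22) $\psi_n(\beta, z) = \dfrac{\partial^n}{\partial z^n}\, \psi_0(\beta, z)$,

whence we at once deduce the formula (1). Applying Taylor's theorem for the
F-equation (Theorem 8) to the solution (B) we see that [1, p. 259]

(23) $\psi_\alpha(\beta, z + h) = \sum_{n=0}^{\infty} \dfrac{h^n}{n!}\, \psi_{\alpha+n}(\beta, z)$;
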

putting a equal to zero we find that 

(24) _sin^ z + h) = E ^ «), 

JiTT lb 

and, more specially [1, p. 260] 

(25) (i+^Tc-^ = I:->«K2). 

^ \ 2/ n-0 n' 

Applying the same theorem to the solution (A) we obtain the formula

(26) $e^{h}\, \psi_\beta(\alpha, z + h) = \sum_{n=0}^{\infty} \dfrac{h^n}{n!}\, \psi_\beta(\alpha - n, z)$,



whence we recover the formula (11) by putting a equal to zero. 
Applying Theorem 9 to the solution (B) yields the result 


(27) 


E F\pa+»{I3, Z) = [ 

ncaO *'0 


e"° i^„(/0, 2 + et) d 9 , 


which contains as a special case the formula 

(28) ±/Vn{m,z) = l,2(l+^))_. 

Appell's generating expansion (see Theorem 10, part C or [3, p. 120]) applied
to the solution (A) yields the result

(29) E 'P?{^) 2 + = 6*' ^ E 'l'?{i^} y)i j 

71 = 0 


hence 

(30) 


nsO 7l\ 


gCio/(>!+») V ( _1L-Y p ^^^’ 

^0 \z + y/ n' ' 


Putting y equal to zero and using the formula (13) we see that
(31) P^in, 2 :) = e' ("l — cos fiir. 

«_o n' \ zj 

Comparing this result with the formula (25) we see that 

(32) = {-yv^{n,z). 

It would be possible to proceed in this same fashion and discover many other
formal properties of the Poisson-Charlier functions, but it is perhaps easier to
notice from the formula (13) that

(33) $p_\beta(\alpha, z) = \cos(\beta - \alpha)\pi\; \Gamma(\alpha + 1)\, z^{-\alpha} L_\alpha^{(\beta - \alpha)}(z)$,

$L_\alpha^{(\beta - \alpha)}(z)$ being Laguerre's function suitably generalized for
complex lower index [4, p. 53]. By means of this formula every relationship involving
Laguerre functions may be translated into one involving Poisson-Charlier functions.
1. Random Variables with Comparable Peakedness. Z. W. Birnbaum,
University of Washington.

Let U and V be random variables with symmetrical distributions, i.e., with
$P(U \le -T) = P(U \ge T)$ and $P(V \le -T) = P(V \ge T)$ for all $T \ge 0$. The
random variable U shall be called more peaked than V if
$P(|U| \ge T) \le P(|V| \ge T)$ for all $T \ge 0$. Let $X_1, Y_1$ and $X_2, Y_2$ be
two pairs of independent random variables such that $X_i$ is more peaked than $Y_i$
for $i = 1, 2$. Then under certain additional conditions $X = X_1 + X_2$ is more
peaked than $Y = Y_1 + Y_2$.

2. On Optimum Tests of Composite Hypotheses with One Constraint. Erich
L. Lehmann, University of California, Berkeley.

The problem studied is that of finding all similar and bisimilar test regions of composite
hypotheses, and of obtaining the most powerful of these regions. Various results are
obtained for distributions which admit sufficient statistics with respect to their parameters.
Applications are made to the hypothesis specifying the value of the circular correlation
coefficient in a normal population, and certain hypotheses concerning scale and location
parameters in exponential and rectangular populations.

3. Estimation of a Distribution Function by Confidence Limits. Frank J.
Massey, Jr., University of California, Berkeley.

Let $x_1, x_2, \cdots, x_n$ be the results of n independent observations, having the
same cumulative distribution function $F(x)$. Form the function $S_n(x) = k/n$ where
k is the number of observations less than or equal to x. A confidence band
$S_n(x) \pm \lambda/\sqrt{n}$ will be used to estimate $F(x)$. To determine the
confidence coefficient it is necessary to find
$\Pr\{\max_x |S_n(x) - F(x)| \le \lambda/\sqrt{n}\}$. It is sufficient to consider x
uniformly distributed in the interval (0, 1). Let $\lambda\sqrt{n} = s/t$ where s and
t are integers. Then $S_n(x)$, to stay in the band $F(x) \pm \lambda/\sqrt{n}$, can
only pass through certain lattice points above $x = i/tn$, $i = 1, 2, \cdots, tn$. The
probability of $S_n(x)$ passing through a particular sequence of these points is given
by the multinomial law, and this can be summed over all permissible sequences. Limiting
distributions have been given by A. Kolmogoroff, and by N. Smirnoff. It is desired to
test the hypothesis $F(x) = F_0(x)$ against alternatives $F(x) = F_1(x)$. Using the
criterion reject $F_0(x)$ if

$$\max_x \sqrt{n}\, |F_0(x) - S_n(x)| > \lambda$$

the probability of first kind of error can be controlled by choice of $\lambda$. A lower
bound to the probability of second kind of error against alternatives such that
$\max |F_0(x) - F_1(x)| \ge \Delta$ is given. This lower bound approaches one as
$n \to \infty$. Thus the test is consistent.
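
The test criterion in the preceding abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `max_scaled_deviation` is hypothetical, and the hypothesized distribution $F_0$ is passed in as an ordinary Python callable.

```python
import math


def max_scaled_deviation(sample, cdf):
    """Return sqrt(n) * max_x |F0(x) - S_n(x)| for the empirical step function S_n.

    S_n jumps by 1/n at each observation, so the largest deviation is attained
    at an observation, just before or just after the jump.
    """
    xs = sorted(sample)
    n = len(xs)
    largest = 0.0
    for k, x in enumerate(xs, start=1):
        f0 = cdf(x)
        largest = max(largest, k / n - f0, f0 - (k - 1) / n)
    return math.sqrt(n) * largest


# Hypothesis: observations are uniform on (0, 1), so F0(x) = x.
statistic = max_scaled_deviation([0.1, 0.4, 0.7], lambda x: x)
print(statistic)
```

The hypothesis $F_0$ would be rejected when the returned value exceeds the chosen $\lambda$; the choice of $\lambda$ for a given first-kind error is the subject of the abstract and is not attempted here.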

4. A Note on Sequential Confidence Sets. Charles Stein, Columbia University.

This paper generalizes a paper of Stein and Wald, appearing in the Annals of Math. Stat.,
Sept., 1947.

Let $\{X_i\}$, $(i = 1, 2, \cdots)$, be a sequence of random variables whose distribution
depends on an unknown parameter $\theta$. Sequential confidence sets are determined by a
rule indicating when to stop sampling and a rule giving the confidence set as a function of
the sample. It is desired that, for each sample point, the confidence set should be one of a
specified class S, that the probability of covering the true parameter should be
$\ge \alpha$, and that the least upper bound of the expected number of observations should
be minimized. If the $X_i$ are independent with the rectangular distribution on
$(0, \theta)$ and S consists of all intervals of the form $(\theta_0, k\theta_0)$ with k
fixed and $\theta_0$ a function of the sample, the optimum sequential procedure is the
classical non-sequential procedure. If the $X_i$ are independently and identically
distributed in accordance with a multivariate normal distribution with known covariance
matrix $\Sigma$ but unknown mean $\theta$, and the confidence sets are to be of the form
$(\theta - \theta_0)' \Sigma^{-1} (\theta - \theta_0) \le r$, r fixed, $\theta_0$ a
variable p-dimensional vector, a similar result holds, provided the desired confidence
coefficient $\alpha$ is not excessively small.

5. Explicit Solution of the Problem of Fitting a Straight Line when Both
Variables are Subject to Error for the Case of Unequal Weights. Elizabeth
L. Scott, University of California, Berkeley.

Let $\alpha$, $\beta$ and $\xi_i$, $(i = 1, 2, \cdots, s)$, be unknown fixed numbers and
let $\eta_i = \alpha + \beta\xi_i$. For each value of i there exist $m_i$ measurements
$x_{ij}$ of $\xi_i$ and $n_i$ measurements $y_{ik}$ of $\eta_i$,
$(j = 1, 2, \cdots, m_i;\ k = 1, 2, \cdots, n_i)$. The variables $x_{ij}$ and $y_{ik}$
are normally distributed about $\xi_i$ and $\eta_i$ with variances $\sigma_1^2/u_i$ and
$\sigma_2^2/v_i$ respectively, where the weights $u_i$ and $v_i$ are known but
$\sigma_1^2$ and $\sigma_2^2$ are unknown. The numbers $m_i$ and $n_i$ are bounded
(usually small) while s increases indefinitely. Thus $\alpha$, $\beta$, $\sigma_1^2$ and
$\sigma_2^2$ appear as structural parameters and the $\xi_i$ as incidental parameters.
(See paper by J. Neyman and E. L. Scott to appear in Econometrica.) Modified maximum
likelihood equations (MMLE) yielding consistent estimates of the structural parameters are
tedious to solve when the products $m_i u_i$ and $n_i v_i$ depend on i. The main result
of this paper consists in proving that the varying $m_i u_i$ and/or $n_i v_i$ can be
treated as constants. Let $w_1$ and $w_2$ be the harmonic means of $m_i u_i$ and
$n_i v_i$ respectively. New MMLE's written with $m_i u_i = w_1$ and $n_i v_i = w_2$
yield consistent estimates of $\alpha$ and $\beta$. The asymptotic variances are also
found. An application is made to certain problems of astronomy.

6. Unbiased Estimates with Minimum Variance. Charles Stein, Columbia
University.

Let X be a random variable distributed in the space R according to one of the p.d.f.'s
$\varphi(x \mid \theta)$, where $\theta$ is an unknown parameter, and let $g(\theta)$ be
a real-valued function of $\theta$. Let $B(\theta)$ be the set of all x such that
$\varphi(x \mid \theta) > 0$ but $\varphi(x \mid \theta_0) = 0$, and S the set of all
$\theta$ such that $B(\theta)$ has probability 0 when $\theta$ is the true parameter
value. Let $\psi(x \mid \theta) = \varphi(x \mid \theta)/\varphi(x \mid \theta_0)$ and
$A(\theta_1, \theta_2) = E[\psi(X \mid \theta_1)\, \psi(X \mid \theta_2) \mid \theta_0]$
for $\theta_1, \theta_2$ in S. Suppose $A(\theta_1, \theta_2)$ is everywhere finite and
there exists a set function $\lambda$ of bounded variation over S such that
$\int_S A(\theta_1, \theta_2)\, d\lambda(\theta_1) = g(\theta_2)$. Then an estimate of
$g(\theta)$, unbiased for all $\theta$ in S and having minimum variance at $\theta_0$, is
given by $f(x) = \int_S \varphi(x \mid \theta)\, d\lambda(\theta)/\varphi(x \mid \theta_0)$.
The minimum variance is $\int_S g(\theta)\, d\lambda(\theta) - [g(\theta_0)]^2$. If the
definition of $f(x)$ is modified at a set having probability 0 when $\theta = \theta_0$,
the properties on S and at $\theta_0$ remain unchanged. Under mild restrictions this
alteration can be carried out so as to make $f(x)$ an unbiased estimate of $g(\theta)$
for all $\theta$. The results are related to the work of Fisher, Dugué, Rao, and
Bhattacharyya on the amount of information.

7. Sufficient Statistics and a System of Partial Differential Equations. (A
Contribution to the Neyman-Pearson Theory of Testing Hypotheses.) Preliminary
Report. Erich L. Lehmann, University of California, Berkeley, and
Henry Scheffé, University of California, Los Angeles.

In the Neyman-Pearson theory of testing hypotheses the problem of the existence and
determination of similar regions has been treated under two approaches: (1) Assuming the
existence of a set of sufficient statistics for the nuisance parameters; (2) Assuming that the
probability density satisfies a certain system of partial differential equations. By solving
the differential equations it is now shown that they imply the existence of sufficient statistics
for the nuisance parameters. Knowledge of the form of the solution of the differential
equations permits simplification of the known theory of optimum tests (type $B$, $B_1$, etc.)
as well as some generalization.


8. Power Function of the Analysis of Variance and Covariance Test of a 
Normal Bivariate Population. W. M. Chen, University of California, Berkeley. 

The problem of finding the power function of the analysis of variance and covariance test
of a normal bivariate population, $\rho = 0$ and $\sigma_1 = \sigma_2$, by means of principle
of likelihood was reduced to the determination of the distribution function $P(L)$ of the
following moment problem:


k r f ” ~ \ 2 } 


iVir,T 


(fc = 1. 2, ...), 


where 


- 1 + 21 + 0 

and a, the argument of the power function, lies in the interval (0, 1) and vanishes only when
the hypothesis tested is true. The moment problem was found and solved by rather tricky
methods. The result is


P(L) = 


(1 - 





s + 1 


) 


where b = 



9. A Mathematical Model of the Relation between White and Yolk Weights 
of Birds’ Eggs. G. A. Baker, University of California, Davis. 

The purpose of such a model is to find a rational method of estimating a “best line” in
some sense which will represent the relation between white and yolk weights for some or all
species of birds. From data at hand it appears that birds within species may differ in
means and variances of weights and that the yolk and white weights are positively
correlated. Yolk and white weights within a species are functions of egg number. The standard
deviations of yolk and white weights for different species are approximately proportional
to mean values. The “true” means for yolk and white weights for different species do
not lie on a line because of biological differences between species with the same egg size.
The standard deviation of species deviations from a straight line depends on the size of the
egg (may be proportional to a weighted sum of the yolk and white weights). If sampling
variances are sufficiently small they may be neglected and a straight line fitted assuming
both variables subject to error and non-uniform variance. The practicality of maximum
likelihood estimates is considered.

10. Statistical Analysis for a New Procedure in Sensitivity Experiments. A.
M. Mood, Iowa State College, and W. J. Dixon, University of Oregon.

In the language of biological assay the sensitivity experiment investigates the proportion
of subjects that respond to a given concentration, x, of a certain chemical. It is assumed
that only one test may be made on each subject. The new procedure is characterized by a
change in x for each successive test, depending on the result of the preceding test: x is
reduced to the next lower of a fixed set of concentrations for the next test if there is no
response and is increased to the next higher concentration if there is a response.
Observations are thus concentrated near the mean and few tests are made for values of x where a
very large or very small proportion of subjects would respond. Assuming x is normally
distributed, approximate maximum likelihood estimates are obtained for the mean and
standard deviation of x. These assume a form which is simple to compute. Choice of
optimum increments of x for various situations is investigated.

11. The Relation of Inbreeding to Calf Mortality. W. M. Regan, S. W.
Mead, and P. W. Gregory, University of California, Davis.

An analysis of calf mortality in the University of California dairy cattle breeding
experiment is presented. Calves up to 4 months of age that were born singly are included in the
study. Only those stillbirths and abortions from cows free from Brucellosis and health
and reproductive abnormalities were considered. A total of 774 Jersey and 258 Holstein
calves were included. Calves were classified according to inbreeding coefficients as follows:
Class I, the controls, 0.0 to 0.1240; Class II, 0.126 to 0.2448; Class III, 0.245 to 0.3740; and
Class IV, 0.376 and over. There was no relation between the number of abortions and the
degree of inbreeding. The stillbirths, too few to be statistically significant, tend to increase
as the coefficient of inbreeding increased. Following birth, however, mortality was
correlated with inbreeding of both males and females, but for the males it was greater than for
the females in Classes III and IV, although the difference is hardly significant. The Jerseys
tended to be less viable than the Holsteins. Some of the increased mortality of the more
highly inbred animals could be accounted for by the action of two lethal genes, one
controlling an anomaly of the liver, the other an anomaly of the heart; there was no plausible
explanation for most of it. Within sex, inbreeding class, and breed there was considerable
variation in the mortality of the progeny of different sires. Some of these differences were
statistically significant.

12. Observations on Designs for Cooperative Field Tests. P. A. Minges,
University of California, Davis.

In California conditions vary so greatly between the principal production areas that it
is necessary to establish experimental plots in each of the areas if reliable information is
to be obtained regarding cultural practices. Most of these tests must be conducted on
ranches in cooperation with growers and local agricultural extension agents. The designs
of these tests should be relatively simple, the arrangement should be adjustable to work
into the growers' cultural practices and to permit the obtaining of yield records with a
minimum of interference to the growers' operations, yet the design must be adequate to
yield valid data. The randomized block design has proved the most useful, although paired
plots, factorials, split-plots and Latin squares have been used successfully under certain
conditions. The Latin square design is useful when a two-way variation is expected;
otherwise it is not usually very efficient. Where yield data are of prime importance, four
replications have been considered most practical. In tests such as variety trials when factors
other than yields are important, two replications may be adequate. The size of the plot
has been varied to fit the crop, conditions of the field, and known soil variability. Plots
two rows wide and 60 to 135 feet long often have been used, frequently without guards
between plots. Since it is desirable to include checks (untreated controls) in most tests,
small plots will reduce the loss to the growers when the treatments prove beneficial. The
information derived from these tests is of most interest to growers and county agents so
the data should be presented in tables that are easily read. The variability figure, which is
confusing to most people, probably can best be presented as the least significant difference.


13. Population Genetics. N. H. Horowitz, California Institute of
Technology.

Population genetics attempts to describe the effects on the genetical structure of
Mendelian populations of factors such as mutation, selection, migration, and random fluctuations
due to sampling errors. These diverse elements are brought under a common viewpoint by
considering their effects on gene frequencies. Since change in gene frequency is the
elementary process of evolution, the above factors are causal agents of evolution.
Mathematical models illustrating the interplay of the various elements have been constructed by
Wright, Haldane, and Fisher. The nature of Mendelian inheritance is such that gene
frequencies remain constant in large populations not subject to net mutation, selection, or
migration pressures. Unbalanced pressures initiate evolutionary changes which continue
until equilibrium is reached at a new level of gene frequencies. Equilibrium frequencies are
determined by opposing pressures, e.g., opposing mutation rates, mutation opposed by
selection, etc. Equilibrium, stable or unstable, is also possible under selection alone. In
small populations, sampling errors among the gametes produce random fluctuations in gene
frequencies which, superimposed on the equilibrium values, result in probable distributions
of frequencies. The latter provide a mechanism for the evolution of characters, especially
biochemical syntheses, which depend on the simultaneous action of a number of individually
non-adaptive genes.
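
The remark that equilibrium frequencies are determined by opposing pressures can be illustrated for the simplest case, opposing mutation rates. The sketch below is an assumption made for illustration, not a model taken from the abstract: it uses the usual textbook one-locus recurrence with a forward mutation rate `u` and a back mutation rate `v`, and the function names are hypothetical.

```python
def next_frequency(q: float, u: float, v: float) -> float:
    """Gene frequency after one generation of two-way mutation.

    Assumed model: the allele gains from the alternative allele at rate u
    and is lost by back mutation at rate v.
    """
    return q + u * (1.0 - q) - v * q


def equilibrium_frequency(u: float, v: float) -> float:
    """Frequency at which the two mutation pressures balance."""
    return u / (u + v)


u, v = 1e-3, 3e-3
q = 0.9  # arbitrary starting frequency
for _ in range(20000):
    q = next_frequency(q, u, v)

print(q, equilibrium_frequency(u, v))
```

Starting from any frequency, the recurrence settles at u/(u + v), the new level at which the unbalanced pressure has disappeared.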


14. The Choice of Inspection Stringency in Acceptance Sampling by Attributes.
J. L. Hodges, Jr., University of California, Berkeley.

In acceptance sampling by attributes, the probability p that an item will be defective is
taken to be a function $g(x, y)$ of the quality x of the population and the stringency y of
inspection. Let n, the number of items inspected, be fixed, and reject if the number of
defectives is $\ge k$. It may then be possible to satisfy a condition on the power function
with different values of k, by adjusting y properly. This paper is concerned with the choice
of k and y in such situations. A criterion is given, and it is shown that the criterion is
approximately satisfied by $k = [n g(x_0, y)]$ where $x_0$ separates acceptable and
non-acceptable values of x, and y maximizes

$$\frac{\partial g(x_0, y)/\partial x}{\{g(x_0, y)[1 - g(x_0, y)]\}^{1/2}}.$$

An asymptotic property of this approximation is shown. The method is applied to two
examples: (a) testing the mean bacterial density x of a liquid by the dilution method, y
being the volume of liquid incubated, and (b) testing the variance x of a normally
distributed dimension of known mean m by applying gauges set at $m \pm 1/y$. The approximate
solution is found to be satisfactory in both cases for n = 20.
15. The Application of Learning Curves to Industrial Planning. Preliminary
Report. James R. Crawford, Lockheed Aircraft Corporation.

Learning curves are significant factors of analysis in industries producing quantities of
less than 20,000 units of a given article. Ship-building and airframe manufacture are the
two largest industries in this class. Learning curves occur where job costs are kept either
by individual unit or by lot, and also where achievement is measured against a standard.
Cost per unit plots against ordinal unit number as a straight line on logarithmic graph
paper. Learning curves are used to supplement time-studies, determine the capacity of
tooling, layout of budgets, and for estimating and bidding. The experience of individual
workers and management are reflected in these analyses. The slope of the learning curve
is related to the amount to be learned. Plateaus occur which are related to the hiring of
new workers and to the relaxing of control measures. Other consistent minor patterns
occur which are related to specific conditions. Equations have been derived and tables
computed for five related forms of the learning curve. Graphic methods are satisfactory
except for bidding. This study covers a simple approach to an important problem of
industrial management. The findings in the industrial field may benefit research in the field
of the psychology of learning.
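
A straight line on logarithmic graph paper corresponds to a power law, cost = a · (unit number)^b. The following sketch fits such a line by ordinary least squares on the logarithms. It is an illustration only: the function name `fit_learning_curve` is hypothetical, the data are synthetic, and the abstract's own five forms of the learning curve are not reproduced.

```python
import math


def fit_learning_curve(unit_numbers, unit_costs):
    """Fit cost = a * unit**b by least squares on (log unit, log cost)."""
    log_x = [math.log(x) for x in unit_numbers]
    log_y = [math.log(y) for y in unit_costs]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((vx - mean_x) ** 2 for vx in log_x)
    sxy = sum((vx - mean_x) * (vy - mean_y) for vx, vy in zip(log_x, log_y))
    b = sxy / sxx                        # slope of the line on log-log paper
    a = math.exp(mean_y - b * mean_x)    # cost of the first unit
    return a, b


# Synthetic data: cost falls to 80 percent each time the unit number doubles.
units = [1, 2, 4, 8, 16, 32]
costs = [100.0 * x ** math.log2(0.8) for x in units]

a, b = fit_learning_curve(units, costs)
print(a, b, 2 ** b)
```

The quantity `2 ** b` is the factor by which unit cost changes when the unit number doubles, which is one convenient way of describing the slope.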

16. Relative Effects of Inbreeding and Selection in Poultry. W. O. Wilson,
University of California, Davis.

Egg production rate, fertility, hatchability, and chick mortality records from the Iowa
State College Poultry Department's inbreeding project were studied. Statistics which
were calculated from the data included simple and partial regression of traits on inbreeding,
estimates of heritability by correlation between paternal half-sibs and by daughter-dam
regressions, and selection differentials. The net genetic gain or loss in merit per generation
was considered to be the sum of the product of selection differentials and heritability, plus
the product of regression of trait on inbreeding and increase in amount of inbreeding. The
amount of inbreeding that can be done in each of the traits was estimated when there was no
net loss or gain. Of the traits studied, the rank was in the following order: hatchability,
chick mortality, fertility, and egg production.

17. The Rate of Genetic Gain in Egg Production in Progeny-Tested Flocks
as a Function of the Interval between Generations. Everett R. Dempster and
I. Michael Lerner, University of California, Berkeley.

The rate of genetic gain in a character for which selection is practiced depends, in addition
to the intensity of selection, on (1) the accuracy of selection, and (2) the average interval
between generations. These factors are not independent and exercise a pull in opposite
directions. Through the application of Wright's technique of path coefficients comparisons
can be made between the expected rates of genetic gain in populations containing varying
proportions of breeding animals of different ages. The methods used involve the estimation
of correlations between genotypes, and various selection indexes based on individual, sib
and progeny records in unculled populations as well as in populations whose range has been
restricted by previous selection. From these estimates the relative efficiencies of different
age distribution schemes of a breeding population can be determined. A specific solution
for such a situation in a flock bred for egg production will be presented as an illustration of
the problems and methods used in the study of the genetics of populations under artificial
selection.

18. Statistical Criteria of the Effectiveness of Selective Procedures. Preliminary
Report. R. F. Jarrett, University of California, Berkeley.
The “validity coefficient,” the standard error of estimate, the index of predictive
efficiency, the “selection ratio” of Taylor and Russell, Johnson's Gamma, and other statistical
devices have been suggested as indices of the effectiveness of selective programs. These
devices all suffer from the deficiency that they do not permit a satisfactorily precise estimate
of the dollar value of the increased output expected from the selection program and thus
leave unsettled the question as to whether or not the cost of such a program is justified.
The relationship between the correlation coefficient on the one hand and the mean value of
Y for an unselected population (Y being an objective output-type criterion), the standard
deviation of Y for an unselected population, and the mean value of Y for the upper Np
individuals selected on the basis of their high performance on the selective test X on the other
hand, provides the basis for estimating the increase in the mean output of a group of workers
selected on the basis of a testing program yielding any specified validity coefficient with the
criterion Y. Increase in productivity of selected workers is shown to be a function of the
validity coefficient, the rigorousness of selection, and the coefficient of variability of the
output criterion among “unselected” employees.

19. Approaches to Umvocal Factor Scores. Preliminary Report. J. P. 
Guileoed, University of Southern California. 

In spite of the fact that univocal factor scores are badly needed for various reasons, it 
appears to be impossible by present methods to construct pure tests for some common 
factors. Recourse must therefore be made to statistical control of component variances. 
It is desirable to derive each factor score from a minimum number of tests. The availability 
of a few univocal tests makes this requirement fairly easy to satisfy. Such tests serve well 
as suppression variables for their common-factor variances where not wanted in other tests. 
Several principles may be invoked as objectives: (1) to maximize the desired variance in 
the impure test, (2) to reduce the undesired variance to zero, or (3) to minimize the undesired 
variance without intolerable loss of the desired variance. A secondary objective is to 
assure a combining weight of +1.00 for the test measuring the desired factor. Equations 
for achieving the objectives have been derived and the limitations and implications of each 
procedure have been noted. By means of statistical control, the situation seems hopeful 
for the achievement of univocal scores for a fairly large number of unique psychological 
variables. There are implications for experimental psychology as well as for vocational 
testing. 
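
Objective (2) can be illustrated with a toy two-factor case. This is a sketch under assumptions that are not in the abstract: factor loadings are treated as known numbers, the impure test loads on a wanted and an unwanted factor, the suppressor test is univocal on the unwanted factor, and the impure test keeps a combining weight of +1.00. The function names are hypothetical, and the author's own equations are not reproduced here.

```python
def suppressor_weight(impure_unwanted_loading, suppressor_unwanted_loading):
    """Weight w such that (impure + w * suppressor) has zero unwanted loading.

    The impure test is held at a combining weight of +1.00.
    """
    if suppressor_unwanted_loading == 0:
        raise ValueError("suppressor must load on the unwanted factor")
    return -impure_unwanted_loading / suppressor_unwanted_loading


def composite_loadings(impure, suppressor, w):
    """Factor loadings of the composite 1.00 * impure + w * suppressor."""
    return tuple(a + w * b for a, b in zip(impure, suppressor))


if __name__ == "__main__":
    impure = (0.6, 0.4)      # (wanted factor, unwanted factor), illustrative values
    suppressor = (0.0, 0.8)  # univocal test of the unwanted factor
    w = suppressor_weight(impure[1], suppressor[1])
    print(w, composite_loadings(impure, suppressor, w))
```

In this toy case the composite retains the wanted loading unchanged while the unwanted loading cancels, which is the sense in which a univocal test serves as a suppression variable.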

20. A Note on the Problem of Binary Stars. Elizabeth L. Scott, University 
of California, Berkeley. 

This paper concerns some of the problems of Trumpler (see next abstract). $\xi_{ij}$ is the 
radial velocity of the $i$-th star, $i = 1, 2, \ldots, s$, at time $t_{ij}$ selected at random, 
$j = 1, 2, \ldots, n$. $x_{ij}$, the measurement of $\xi_{ij}$, is $N(\xi_{ij}, \sigma_i)$. $\xi_{ij}$ is random with a 
distribution involving $(\xi_{ij} - \xi_{i0})^2$ [...], where $k_i \ge 0$ and $\xi_{i0}$ are unknown. 

(1) Test of the hypothesis that $k_i = 0$. Case (i): $\sigma_i$ known. Whatever the exact test $T$, its 
power $\beta_T(k_i)$ has derivative $\beta_T'(0) = 0$. The test maximizing $\beta_T''(0)$ is that of Trumpler 
with criterion $\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2 > C$. Case (ii): $\sigma_i$ unknown. Whatever the exact test $T$, 
$\beta_T^{(m)}(0) = 0$, $m = 1, 2, 3$. The test maximizing $\beta_T^{(4)}(0)$ is Trumpler's test 
$\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^4 > \left[\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2\right]^2 C$. 

(2) Let $\sum_{j=1}^{n} (\xi_{ij} - \xi_{i0})^2 = 2\lambda_i \sigma_i^2$. For constant velocity stars $\lambda = 0$. For others it 
is a random variable. Since, given $\lambda$, [...] is distributed as non-central $\chi^2$, an integral 
equation connects the distributions of [...] and $\lambda$. Its solution yields an estimate of the 
proportion of constant velocity stars. After estimating the distribution of $\lambda$, the level of 
significance can be estimated and also the number $n$ of measurements so that the proportion 
of constant velocity stars declared variable will be less than $p$, specified in advance. 

21. Statistical Problems of Spectroscopic Binaries. Robert J. Trumpler, 
University of California, Berkeley. 

Spectroscopic Binaries are stars whose radial velocities, as measured by the Doppler 
shift of spectral lines, show a periodic variation. The first problem is to obtain a statistical 
criterion for deciding whether a star with several radial velocity measures, made at different 
times, has a high probability (larger than a specified limit) of variable velocity and should 
be announced as an object worthy of further study. The second problem is to find the 
percentage of variable velocity stars among a large list of stars with several radial velocity 
measures for each star. From the distribution of standard errors only the percentage of 
cases where the velocity variation exceeds a certain limit can be ascertained. The third 
problem is concerned with those stars for which a binary orbit has been determined. The 
statistical distribution of these binary systems according to mean distance between the two 
stars and the ratio of their masses can be evaluated within certain limits. 
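
The first problem can be illustrated with a simple dispersion check. The sketch below is not the author's criterion; it assumes independent normal measurement errors with a known standard error sigma, compares the scaled sum of squared deviations with an upper chi-square quantile, and obtains that quantile from the Wilson-Hilferty approximation rather than from exact tables. All function names are hypothetical.

```python
from statistics import NormalDist, fmean


def dispersion_statistic(velocities, sigma):
    """Sum of squared deviations from the star's mean velocity, in units of sigma^2."""
    m = fmean(velocities)
    return sum((v - m) ** 2 for v in velocities) / sigma ** 2


def chi2_upper_quantile(df, alpha):
    """Approximate upper-alpha chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def looks_variable(velocities, sigma, alpha=0.05):
    """Flag a star whose velocity scatter exceeds what measurement error explains."""
    if len(velocities) < 2:
        raise ValueError("need at least two radial velocity measures")
    df = len(velocities) - 1
    return dispersion_statistic(velocities, sigma) > chi2_upper_quantile(df, alpha)


if __name__ == "__main__":
    print(looks_variable([10.0, 12.0, 14.0], sigma=2.0))
    print(looks_variable([0.0, 20.0, -20.0, 25.0, -25.0], sigma=2.0))
```

A star flagged in this way is only a candidate for further study; the choice of alpha fixes the proportion of constant velocity stars that would be declared variable by chance.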



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest. 

Personal Items 

Mr. Kenneth J. Arrow has been appointed Research Associate of the Cowles 
Commission. 

Dr. W. D. Baten, formerly of Michigan State College, is now Chief, Operations 
Branch, Planning Section, Air Defense Command, Mitchel Field, New 
York. 

Dr. Paul T. Bruyere is now Chief of the Medical Records and Statistics Branch, 
Army Institute of Pathology, Office of the Surgeon General, War Department. 

Dr. A. C. Cohen received his discharge from the Army, with the rank of Lieutenant 
Colonel, at the beginning of the spring quarter, and returned to his former 
position at Michigan State College. He has accepted a position at the University 
of Georgia beginning with the 1947 summer session there. 

Dr. Hallett H. Germond has resigned from his position as professor of mathematics 
at the University of Florida. He is now Director of Research for the 
S. W. Marshall firm of Consulting Engineers, in New York City. 

Dr. Meyer A. Girshick, formerly with the Department of Agriculture, is now 
with the Douglas Aircraft Company in Santa Monica, California. 

Dr. Clyde H. Graves has accepted a position as Operations Analyst, Operations 
Analysis, Air Defense Command, Mitchell Field, New York. 

Dr. E. J. Gumbel has been appointed to an Associate Professorship at Brooklyn 
College. 

Dr. Trygve Haavelmo has returned to Norway, and is at the University 
Institute of Economics, Oslo. 

Mr. Joseph O. Harrison, Jr., is now employed as a Mathematician in the 
Computing Branch of the Ballistic Research Laboratories, Aberdeen Proving 
Ground. 

Dr. Wassily Hoeffding has accepted a position as Research Associate, The 
Institute of Statistics, University of North Carolina, Chapel Hill. 

Mr. Cyrus A. Martin is now an administrative analyst and statistician, assisting 
Chief of Personnel Control of Signal Corps, in Washington, D. C. 

Mr. Jack I. Northam has accepted an Assistant Professorship in the Department 
of Mathematics, Kansas State College, Manhattan, beginning with the 
1947 summer session. 

Professor Henry Scheffé, who has been on leave for the past year, returned to 
his position in the Engineering Department, University of California at Los 
Angeles, in June. 

Mr. Edward M. Schrock has accepted a position as Quality Control Engineer 
with the General Electric Company at their Erie Works, Erie, Pa. 

Mr. Jerome R. Steen, who has been manager of Quality Control Engineering 
with the Sylvania Electric Products in Emporium, Pa., has now transferred with 
the same company to Flushing, New York. 


Professor Emeritus Irving Fisher, of Yale University, died April 29, 1947, 
at the age of eighty. 


In connection with the Atlantic City meeting of the American Chemical 
Society, April 14-18, 1947, a symposium on Statistical Methods in Experimental 
and Industrial Chemistry was held, in which several members of the Institute 
of Mathematical Statistics took part. The following program was presented 
Tuesday morning and afternoon, April 15: 

(1) Introductory Remarks. B. L. Clarke. 

(2) The Management Viewpoint. George Smith. 

(3) A New Technique for Testing the Accuracy of Analytical Data. W. J. Youden. 

Discussion: Grant Wernimont, R. F. Moran, John Mandel, and Roland 
H. Noel. 

(4) Design of Experiments in Industrial Research. Hugh M. Smallwood. 

(5) Statistical Training for Industry. Samuel S. Wilks. 

Discussion: John Tukey, E. V. Lewis, Churchill Eisenhart, and C. West 
Churchman. 


Preliminary Actuarial Examinations 
Prize Awards 

The winners of the prize awards offered by the Actuarial Society of America 
and the American Institute of Actuaries to the nine undergraduates ranking 
highest in the combined score on Part 1 and Part 2 of the 1947 Preliminary 
Actuarial Examinations are as follows: 


First Prize of $S00 

James H. Chung              University of Toronto 

Additional Prizes of $100 

James P. A. Biggs           Yale University 
George Y. Cherlin           Rutgers University 
Frank H. David              Harvard University 
Thomas M. Galt              University of Manitoba 
Charles F. Pinzka           Rutgers University 
Philip C. Bapp              University of Buffalo 
Morton K. Schwartz          Brown University 
James G. C. Templeton       University of Toronto 


The two actuarial organizations have authorized a similar set of nine prize 
awards for the 1948 Examinations. 

The Preliminary Actuarial Examinations consist of the following three examinations: 

Part 1. Language Aptitude Examination 

(Reading comprehension, meaning of words and word relationships, antonyms, 
and verbal reasoning) 

Part 2. General Mathematics Examination 

(Algebra, trigonometry, coordinate geometry, differential and integral 
calculus) 

Part 3. Special Mathematics Examination 

(Finite differences, probability and statistics) 

The 1948 Examinations will be administered by the College Entrance Examination 
Board at centers throughout the United States and Canada on May 14-15, 
1948. 


Correction 

In the Directory of Members published in Vol. XVII, No. 4 (December 1946) 
Professor Joseph Kampé de Fériet's name is listed in the F's under Fériet. It 
should have appeared in the K's, under Kampé de Fériet. 


New Members 

The following persons have been elected to membership in the Institute (March 1 to May 30, 
1947): 

Adams, Joe K. Ph.M. (Wisconsin) Graduate student and half-time instructor in Psychology, 
Graduate College, Princeton University, Princeton, N. J. 

Adams, Walter B. Communications Analyst, Civil Aeronautics Admin., Dept. of Commerce, 
8S53 S Ingleside Ave., Chicago 19, Ill. 

Aitken, Alexander C. D.Sc. (Edinburgh) Professor of Mathematics, University of Edinburgh, 
SS Stirling Road, Edinburgh 5, Scotland. 

Brambilla, Francesco Ph.D. (Univ. L. Bocconi) Lecturer in Math. Statistics, Institute 
of Statistics, University L. Bocconi, 6 via Panzacchi, Milano, Italy. 

Brown, George Middleton, D.Sc. (Michigan) Asst. Prof. of Math., Mich. State College, East 
Lansing, Mich., 833 Cherry Lane. 

Bueno, Luiz de Freitas, E.E. (Mackenzie Coll.) Professor da Universidade de São Paulo, 
Brazil, Rua Itambé 341, Casa IS. 

Burke, Cletus J., M.A. (U.C.L.A.) Res. Ass't, Univ. of Iowa, Iowa City, Iowa, 118 Riverside 
Park. 

Cameron, Joseph M., M.S. (N. Car. State) Room 302 South Building, National Bureau of 
Standards, Washington, D. C. 

Carpenter, Osmer M.S. (Iowa State) Instructor, Mathematics Department, Iowa State 
College, Ames, Iowa. 

Castellani, Maria D.Sc. (Rome) Visiting Professor, Department of Mathematics, University 
of Kansas City, Kansas City 4, Mo. 

Chernoff, Herman Sc.M. (Brown) National Research Council Pre-Doctoral Fellow, 3003 
Wallace Ave., Bronx 67, N. Y. 

Clark, Stanley M.Ed. (Saskatchewan) Student and teaching assistant, lS01-7th St., S.E., 
Minneapolis 14, Minn. 

Cover, John H. Ph.D. (Columbia) Director, Bureau of Business and Economic Res., 
Univ. of Maryland, College Park, Md. 

Dailey, John T. M.S. (N. Texas Teachers Coll.) Res. Psychologist (Aviation), Psychological 
Res. and Examining Unit, Sqn. E, Indoctrination Div., Air Training Command, 
San Antonio, Texas. 

Darling, Donald A. Ph.D. (Calif. Inst. Tech.) Teaching Ass't, Calif. Inst. of Technology, 
Pasadena 4, Calif. (As of July 1947, Dept. of Math., Cornell Univ., Ithaca, N. Y.) 

Darmois, Georges D.Sc. (Paris) Prof. à la Faculté des Sciences de Paris, 7 Rue de l'Odéon, 
Paris 6, France. 

Davies, J. Alfred M.A. (Alabama) Statistician, Design Eng. Section, General Electric 
Co., WS Hill Avenue, Owensboro, Kentucky. 

Dunnett, Charles W. M.A. (Toronto) Student, 1044 John Jay Hall, Columbia Univ., 
New York 27, N. Y. 

Egermayer, Frantisek Sc.D. (Charles Univ., Prague) Chief of Section, State Statistical 
Office, J? Bilshaho, Prague VII, Czechoslovakia. 

Fickenscher, Edgar H. A.B. (Calif.) Graduate student and teaching ass't, Univ. of Calif., 
7/jflO Acton St., Berkeley %, Calif. 

Fraga, Constantino G. Jr. (Sao Paulo) Head, Dept. of Statistics, Instituto Agronomico, 
Campinas (S.P.), Brazil. 

Frank, Elmore J. B.A. (Chicago) Instr. in Statistics, Ill. Institute of Tech., and Statistician, 
Commercial Res. Dept., Armour and Co., S4SS Maryland Ave., Chicago IE, Ill. 

Frisch, Ragnar Ph.D. (Oslo) Professor, University Institute of Economics, Oslo, Norway. 

Geary, Robert C. D.Sc. Superintending Officer, Statistics Branch, Dept. of Industry and 
Commerce, 27 Leeson Park, Dublin, Ireland. 

Goodman, John R. M.S. (Iowa State) Head, Sampling Section, Survey Res. Center, 
Univ. of Mich., Ann Arbor, Mich. 

Gutman, Pierre M.A. (Columbia) Student, 7 Mountain Ave., Maplewood, N. J. 

Hartline, H. K. M.D. (Johns Hopkins) Assoc. Prof. of Biophysics, Johnson Res. Foundation, 
Univ. of Pennsylvania, 36th and Spruce Sts., Philadelphia, Pa. 

Hartog, Jacob A. (Rotterdam) Rockefeller Fellow, ZB Follen St., Cambridge, Mass. 

Jacobs, Marcus A.B. (Penn.) Health Statistician, S SOlh St., Arlington, Va. 

Jeeves, Terry A. A.B. (Calif.) Teaching ass't in math., Univ. of Calif., Hearst Ave., 
Berkeley 9, Calif. 

Kempthorne, Oscar M.A. (Cambridge, England) Res. Assoc. Prof., Statistical Lab., Iowa 
State College, Ames, Iowa. 

Kendall, David G. M.A. (Oxford) Fellow, Magdalen Coll., Oxford, England. 

Kent, Leonard M.B.A. (Chicago) Instr. in Statistics, School of Business, Univ. of Chicago, 
Chicago 37, Ill. 

Kupperman, Morton B.S. (C.C.N.Y.) Statistician, Office of the Surgeon General, War 
Dept., Z8Z9-Z7th St., N.W., Washington 8, D. C. 

Lhati, Elizabeth L. M.A. (Michigan) Statistician, Bur. of Measurement and Guidance, 
Carnegie Institute of Technology, Pittsburgh 13, Pa. 

Levine, Harry D. B.S. (Chicago) Instr., Long Island Univ., 18^ W. 98 St., New York ZB, 
N. Y. 

Lichtenstein, Morris B.A. (Michigan) 4811 N. Capitol St., N.E., Washington 11, D. C. 

McMillan, Brockway Ph.D. (Mass. Inst. Tech.) Member, Technical Staff, Bell Telephone 
Labs., Murray Hill, N. J. 

Marshall, Andrew W. Student, E7S7 University Ave., Chicago 37, Ill. 

Metzner, Charles A. Ph.D. (Wisconsin) Study Director, Survey Research Center, Univ. 
of Michigan, Ann Arbor, Mich. 

Norton, John W. B.S. (California) Lab. Supervisor, Union Oil Co. of Calif., BB29 Macdonald 
Ave., Richmond, Calif. 

Otter, Richard Ph.D. (Indiana) Instructor, Fine Hall, Princeton Univ., Princeton, N. J. 

Passes, Helena Rocha Penteado Diretor de Divisão do Depto. Estadual de Estatística de 
São Paulo, Avenida Angelica, 160, Apto. 6, São Paulo, Brazil. 

Priest, Edward I. B.S. (Columbia) Student in mathematics, 1;?04 65th St., Chicago, Ill. 

Quensel, Carl-Erik Fil Dr (Lund) Prof, at the University, Lund, Sweden, Linnecjatan 

u 

Rankin, Mozelle M.A. (Ohio State) Ass't Instructor, Ohio State Univ., lOI-Hih Ave., 
Columbus 1, Ohio. 

Robb, Richard A. D.Sc. (Glasgow) Mathematics Lecturer and Mitchell Lecturer in 
Statistics, Univ. of Glasgow, Glasgow, W. 2, Scotland. 

Ruist, Erik Fil. kand. (Stockholm) Amanuens, Industriens Utredningsinstitut, Stockholm 
16, Sweden. 

Sham, Inder M. M.A. (Punjab) Rothamsted Experimental Station, Harpenden, Herts, 
England. 

Schneider, B. Aubrey Sc.D. (Johns Hopkins) Ass't Director, Dept. of Statistics and 
Special Services, American Cancer Society, 47 Beaver St., New York 4, N. Y. 

Seitz, Jiří Ph.D. (Prague) Kouřimská 8, ČSR, Praha XII, Czechoslovakia. 

Simaika, Jacques B. Ph.D. (London) Lecturer, Faculty of Science, Fuad I University, 
Abbassia, Cairo, Egypt. 

Slatin, Benjamin M.A. (Columbia) Jr. Analyst, Econometric Institute, 179 Peshine 
Ave., Newark 8, N. J. 

Suydam, Bergen R. A.B. (N.Y. State Coll. for Teachers) Graduate student, Columbia 
University, 1 W 706 St., Shanks Village, Orangeburg, N. Y. 

Tashmuhamed, Sarymsakov Ph.D. (Moscow) President of the Academy of Sciences of 
Uzb.SSR, Professor of the University, Tashkent, ul. Abdulh Tukaeva I, Tashkent, 
USSR. 

Travers, Robert M. W. Ph.D. (Columbia) Examiner, and Assoc. Prof. of Education, 
Bureau of Psychological Services, Univ. of Mich., Ann Arbor, Mich. 

Weiner, Sidney B.S. (C.C.N.Y.) Student, New York University Graduate School, 1BS9 
East 17th St., Brooklyn SO, N. Y. 

Wezehnan, Sol M. A.B. (Michigan) Graduate student, University of Michigan, Ann 
Arbor, Mich., Burt St., Omaha, Nebr. 

Wishart, John D.Sc. (London) Reader in Statistics, School of Agriculture, Cambridge, 
England. 



REPORT ON THE NEW YORK MEETING OF THE INSTITUTE 

The Twenty-Sixth Meeting of the Institute of Mathematical Statistics was 
held in New York City on Thursday, April 24, and Friday, April 25, 1947, and 
was co-sponsored by the American Mathematical Society. This meeting was 
devoted to a program on Stochastic Processes and Noise. The attendance of 
190 persons included the following 75 members of the Institute: 

F. S. Acton, C. B. Allendoerfer, F. L. Alt, T. W. Anderson, Jr., L. A. Aroian, W. D. Baten, 
Robert Bechhofer, J. H. Bigelow, D. H. Blackwell, Paul Boschan, G. W. Brown, R. S. Burington, 
B. H. Camp, E. W. Cannon, A. G. Carlton, K. L. Chung, P. C. Clifford, D. D. Cody, 
Harald Cramér, H. D. Curry, J. H. Curtiss, R. L. Dietzold, J. L. Doob, Jacques Dutka, 
Churchill Eisenhart, Benjamin Epstein, Will Feller, M. M. Flood, Bernard Friedman, C. P. 
Gerschenson, H. H. Goode, C. H. Graves, E. J. Gumbel, T. E. Harris, Millard Hastay, L. H. 
Herbach, P. G. Hoel, Mark Kac, R. D. Keeney, T. C. Koopmans, William Kruskal, Jack 
Laderman, J. E. Lieberman, S. B. Littauer, Melitta Lowy, P. J. McCarthy, Brockway McMillan, 
Frederick Mosteller, L. P. Nanni, P. M. Neurath, G. E. Noether, M. L. Norden, C. 
O. Oakley, P. S. Olmstead, G. B. Price, J. S. Rhodes, John Riordan, Selby Robinson, Frank 
Saidel, Arthur Sard, F. E. Satterthwaite, G. R. Seth, C. E. Shannon, Jack Sherman, W. A. 
Shewhart, Rosedith Sitgreaves, Andrew Sobczyk, Milton Sobel, Emma Spaney, C. M. 
Stein, J. W. Tukey, D. F. Votaw, Jr., B. T. Weber, S. S. Wilks, Jacob Wolfowitz. 

The first session was held on Thursday morning, with Professor Carl Allendoerfer 
of Haverford College serving as chairman. The following program 
was presented: 

Stochastic Processes: 

Description, Professor J. L. Doob, Columbia University 

Estimation, Professor Will Feller, Cornell University 

Prediction, Professor N. Wiener, Massachusetts Institute of Technology 

This meeting was concluded with a discussion by Dr. H. W. Bode, Bell Telephone 
Laboratories, Professor Mark Kac, Cornell University, and Professor A. Wald, 
Columbia University. 

Dr. S. O. Rice, Bell Telephone Laboratories, was chairman of the Thursday 
afternoon session. The following program was presented: 

Stochastic Processes in Some Applications: 

In Economics, Dr. T. Koopmans, Cowles Commission 
In Insurance, Professor H. Cramér, Yale University 
In Cosmic Radiation, Professor N. Arley, Princeton University 
In Nuclear Physics, Dr. S. M. Ulam, Los Alamos Laboratory 

The final session was held on Friday morning with Professor J. W. Tukey 
of Princeton University as chairman. The program was as follows: 

Different Ways of Describing Noise: 

By a Noise Spectrum, Dr. C. E. Shannon, Bell Telephone Laboratories 
By a Single Function, Mr. J. H. Bigelow, Institute for Advanced Study 
By Many Functions, Professor Mark Kac, Cornell University 
Round Table on Interrelations, Messrs. Shannon, Bigelow, Kac, and Rice 

P. S. DWYER, 
Secretary. 

REPORT ON THE APRIL MEETING OF THE INSTITUTE IN ATLANTIC CITY 

The Twenty-Seventh Meeting of the Institute of Mathematical Statistics was 
held in cooperation with the Eastern Psychological Association, on Saturday 
morning, April 26, 1947, in Atlantic City. This meeting was a Round Table 
on Certain Recent Statistical Developments, and its attendance of approximately 
100 persons included the following 9 members of the Institute : 

F. S. Acton, J. W. Dunlap, Benjamin Epstein, Irving Lorge, P. J. McCarthy, Frederick 
Mosteller, P. J. Rulon, F. E. Satterthwaite, and Emma Spaney. 

Professor Bernard Riess of Hunter College was chairman of the meeting. 
The following program was presented: 

Papers: Sequential Analysis. 
    Dr. Irving Lorge, Teachers College, Columbia University 
Staircase Methods. 
    Dr. Philip J. McCarthy, Cornell University 
Inefficient Statistics. 
    Dr. Frederick Mosteller, Harvard University 

Discussion: Dr. Jack W. Dunlap, Psychological Corporation 
    Dr. Leon Festinger, Massachusetts Institute of Technology 
    Dr. William E. Kappauf, Princeton University 
    Dr. Joseph Zubin, New York Psychiatric Institute Hospital 

P. S. DWYER, 
Secretary. 

REPORT ON THE SAN DIEGO MEETING OF THE INSTITUTE 


The first Western Regional meeting of the Institute of Mathematical Statistics 
was held in San Diego, California, June 17-19, 1947, jointly with the American 
Association for the Advancement of Science. The meeting was attended by 53 
persons, including the following 31 members of the Institute: 

G. A. Baker, Joseph Berkson, Z. W. Birnbaum, H. C. Carver, Harald Cramér, J. R. 
Crawford, Dorothy Cruden, W. J. Dixon, Robert Dorfman, M. W. Eudey, Evelyn Fix, 
John Gurland, J. L. Hodges, Jr., J. M. Howell, H. M. Hughes, E. S. Keeping, E. L. Lehmann, 
R. H. Lien, F. J. Massey, G. F. McEwen, Frederick Mosteller, S. W. Nash, Jerzy Neyman, 
Kathryn B. Rolfe, Henry Scheffé, Herbert Solomon, C. M. Stein, Zenon Szatrowski, H. M. 
Walker, J. D. Williams, Zivia S. Wurtele. 

The afternoon session on June 17 was a joint meeting with the Group of 
Former Operations Analysts. The following program was presented under the 
chairmanship of Col. Roscoe C. Wilson: 

Topic: Statistical Problems in Operations Analysis. 

Papers: Engineering and Statistics at the Pacific Front in World War II. 
    Roger Wilkinson, Bell Telephone Laboratories, New York City. 

Present Organization and Activities of Operations Analysis. 
    Leroy A. Brothers, Operations Analysis, Asst. Chief of Air Staff-3, Washington, D. C. 

Statistical Evidence of Bomb Release Malfunctions. 
    Mark W. Eudey, University of California, Berkeley. 

Study of Effectiveness of Certain Bombs Used Against German Industrial Targets. 
    J. Neyman, University of California, Berkeley. 

The morning session on June 18 was presented with Professor Alva R. Davis 
as chairman, and the program was as follows: 

Topic: Statistical Problems in Biology. 

Papers: A Mathematical Model of the Relation between White and Yolk Weights of Birds' 
Eggs. 
    G. A. Baker, University of California, Davis. 

Statistical Analysis for a New Procedure in Sensitivity Experiments. 
    W. J. Dixon, University of Oregon, and A. M. Mood, Iowa State College. 

The Relation of Inbreeding to Calf Mortality. 
    P. W. Gregory, University of California, Davis. 

Cooperative Field Trials. 
    P. A. Minges, University of California, Davis. 

Population Genetics. 
    N. H. Horowitz, California Institute of Technology. 

Statistical Problems in Assessing Methods of Medical Diagnosis, with Particular 
Reference to X-Ray Technique. 
    J. Yerushalmy, United States Public Health Service, Washington, D. C. 
    Discussion: J. Neyman, University of California, Berkeley. 

Professor John W. Miles was chairman of the afternoon session on June 18, 
which was a joint session with the California Section of the American Society 
for Quality Control. The following papers were presented: 

Topic: Industrial Applications of Statistics. 

Papers: Operating Characteristics of Average and Range Charts. 
    Henry Scheffé, University of California, Los Angeles. 

Sampling Inspection by Variables. 
    Herbert Solomon, Stanford University. 

Some Exact Numerical Results for Sequential Acceptance Sampling by Attributes. 
    Mark W. Eudey, University of California, Berkeley. 

Choice of Inspection Stringence in Acceptance Sampling by Attributes. 
    Joseph L. Hodges, University of California, Berkeley. 

Widening Tolerances for Closer Fitting Parts. 
    Edmond E. Bates, Quality Engineering Consultants, Los Angeles. 
    Discussion: Russell O'Neill, University of California, Los Angeles. 

Re-establishing Operator Responsibility for Quality Control. 
    Wyatt H. Lewis, General Electric Company, Ontario, California. 
    Discussion: William B. Rice, Plomb Tool Company, Los Angeles. 

The Application of Learning Curves to Industrial Planning. 
    James R. Crawford, Wright Field, Dayton, Ohio. 

The Wednesday evening session was under the chairmanship of Professor 
George Beadle, California Institute of Technology, with the following program: 

Topic: Statistical Problems in Genetical Studies in Chickens. 

Paper: Rate of Genetic Gain in Egg Production in Progeny-tested Flocks as a Function of 
the Interval between Generations. 
    Everett R. Dempster and I. Michael Lerner, University of California, Berkeley. 

On Thursday morning, June 19, there was a joint session with the Western 
Psychological Association. Professor Helen Walker of Columbia University was 
chairman. The program was as follows: 

Topic: Statistical Problems in Psychology. 

Papers: Statistical Criteria of the Effectiveness of Selective Procedures. 
    R. F. Jarrett, University of California, Berkeley. 

Unsolved Statistical Problems Arising in Psychological Measurements. 
    Helen Walker, Columbia University. 

Cost Utility Curves as a Means of Assessing Batteries of Tests. 
    Joseph Berkson, Mayo Clinic. 

Approaches to Univocal Factor Scores. 
    J. P. Guilford, University of Southern California. 

The afternoon session on June 19 was under the chairmanship of Professor 
Harald Cramér of Stockholm, Sweden, and offered the following program: 

Topic: Theory of Statistics and Its Applications to Astronomy. 

Papers: Random Variables with Comparable Peakedness. 
    Z. W. Birnbaum, University of Washington. 

Distributions which Lead to Regressions Representable by Polynomials. 
    Evelyn Fix, University of California, Berkeley. 

Optimum Tests of Composite Hypotheses with One Constraint. 
    Erich L. Lehmann, University of California, Berkeley. 

Estimation of a Distribution Function by Confidence Limits. 
    Frank J. Massey, Jr., University of California, Berkeley. 

A Note on Sequential Confidence Sets. 
    Charles Stein, Columbia University. 

Certain Types of Statistical Problems in Astronomy. 
    Robert J. Trumpler, University of California, Berkeley. 

Basic Concepts of the Theory of Statistics in Relation to Certain Problems of 
Astronomy. 
    J. Neyman, University of California, Berkeley. 

A Note on the Problem of Binary Stars. 
    Elizabeth L. Scott, University of California, Berkeley. 

Explicit Solution of the Problem of Fitting a Straight Line when both Variables 
are Subject to Error for the Case of Unequal Weights. (By title) 
    Elizabeth L. Scott, University of California, Berkeley. 

Power Function of the Analysis of Variance and Covariance Test of a Normal 
Bivariate Population. (By title) 
    Way Ming Chen, University of California, Berkeley. 

Unbiased Estimates with Minimum Variance. (By title) 
    Charles Stein, Columbia University. 

Sufficient Statistics and a System of Partial Differential Equations. (By title) 
    Erich L. Lehmann, University of California, Berkeley, and Henry Scheffé, 
    University of California, Los Angeles. 

On Wednesday evening, June 18, at 6 o’clock, there was a dinner for members 
and guests, at the Hotel San Diego. 


P. S. DWYER, 
Secretary. 

ON OPTIMUM TESTS OF COMPOSITE HYPOTHESES WITH ONE CONSTRAINT¹ 

By E. L. Lehmann 
University of California, Berkeley 

Summary. This paper is concerned with optimum tests of certain composite 
hypotheses. In section 2 various aspects of a theorem of Scheffé concerning type 
$B_1$ tests are discussed. It is pointed out that the theorem can be extended to 
cover uniformly most powerful tests against a one-sided set of alternatives. 
It is also shown that the method for determining explicitly the optimum test 
region may in certain cases be reduced to a simple formal procedure. These 
results are used in section 3 to obtain optimum tests for the composite hypothesis 
specifying the value of the circular serial correlation coefficient in a normal 
distribution. A surprising feature of this example is the fact that for the simple 
hypothesis obtained by specifying values for the nuisance parameters no test 
with the corresponding optimum properties exists. 

In section 4 the totality of similar regions is obtained for a large class of prob- 
ability laws which admit a sufficient statistic. Some composite hypotheses 
concerning exponential and rectangular distributions are treated in section 5. 
It is proved that the likelihood ratio tests of these hypotheses have various op- 
timum properties. 

1. Introduction. In developing tests for a class of hypotheses three phases 
may be distinguished. First, tests are obtained which are intuitively appealing; 
next, it is shown that these tests have certain attractive features; finally, it is 
proved that they are "best possible" tests. 

In dealing with parametric hypotheses, the likelihood ratio principle is fre- 
quently used to obtain a reasonable test. For many of the tests so derived for 
normal and exponential distributions, the question of bias has been investigated. 
In most cases unbiasedness has been established; in the other cases, usually a test 
based on the same criterion but with the boundaries shifted, can be proved to be 
unbiased. Other desirable properties which likelihood ratio tests have been 
shown to possess, relate to the asymptotic behaviour of these tests as the sample 
sizes tend to infinity. An interesting problem which does not seem to have been 
treated is the question of admissibility of likelihood ratio tests, a test being 
admissible if its power cannot be improved upon uniformly by any other test of 
the same level of significance. 

Investigations of optimum tests of composite hypotheses have been carried 
through for many hypotheses concerning normal distributions. When the 
hypothesis specifies the value of one parameter (hypothesis with one constraint), 
uniformly most powerful one-sided and type $B_1$ (uniformly most powerful 
unbiased) tests have been obtained. When the number of constraints is larger 
than one, not so much can be expected. It has been shown for some of the tests 
in this class that they have maximum average power uniformly over a family of 
surfaces in the parameter space, or that they are uniformly most powerful with 
respect to the subclass of tests whose power depends only on some function of 
the parameters. (All optimum properties mentioned are relative only to the 
class of all similar regions. This will be so throughout the paper and will usually 
not be stated explicitly.) 

¹ Presented at a meeting of the Institute of Mathematical Statistics in San Diego, June, 
1947. 

Two methods for finding uniformly most powerful or uniformly most powerful 
one-sided regions and type $B_1$ tests, if they exist, are known. Neyman and Pearson 
[1] developed a method for determining all similar regions, and applied it 
to obtain uniformly most powerful one-sided tests of certain hypotheses. Neyman 
[2, 3] extended the method to obtain, for certain hypotheses, the class of all 
bisimilar (unbiased similar) regions, and Scheffé [4], developing the method 
further, proved the existence of type $B_1$ tests for an important class of hypotheses. 

A different method for obtaining all similar and bisimilar regions was devised 
by P. L. Hsu and was used by him and other writers to prove various optimum 
properties of the likelihood ratio tests for the general linear hypothesis, of Hotel- 
ling’s and of other tests [5, 6, 7, 8]. 

In the present paper we are concerned with applications of these two methods 
to composite hypotheses with one constraint. However, the applicability is not 
so restricted. In fact, the second method has been used mainly in connection 
with composite hypotheses with many constraints, and the author believes it to 
be suitable also for deriving optimum classification procedures. An essential 
restriction of both methods seems to be that a set of sufficient statistics must exist 
with respect to the parameters involved; with respect to the nuisance parameters 
so that all similar regions can be found, with respect to the parameters specified 
by the hypothesis so that there exists a best of all similar regions. 

Extensions of the existing theory based on the first method are obtained in 
section 2, and the theory is applied in section 3 to a hypothesis concerning a 
multivariate normal distribution. Sections 4 and 5 are concerned with applications 
of the second method to problems to most of which the earlier method is not 
applicable, in particular to hypotheses concerning exponential and rectangular 
distributions, hitherto only treated from the likelihood ratio point of view. 

2. On the theory of optimum tests. 

2.1 One-sided tests. In an interesting paper [4], Scheffé determined the type 
$B$ and type $B_1$ tests of a certain class of composite hypotheses specifying the 
value $\theta_0$ of a parameter $\theta$ in the presence of nuisance parameters. 

Scheffé's results can, in an obvious way, be extended to cover one-sided sets 
of alternatives. To show this, consider the method used in [4]. Under certain 
assumptions all tests² are found which satisfy the two conditions: 

² The terms "the test w" and "the region [of rejection] w" will be used interchangeably. 
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(a) The power function at 6o has a preassigned value « (the level of signifi- 
cance), independent of the nuisance parameters; 

(b) the power function at 6o has derivative 0. (Condition of unbiasedness) . 
Then that test Wo is determined for which, of all those satisfying (a) and (b), 

(c) the second derivative at So , (3 m(9o), is as large as possible. 

By definition $w_0$ is a type $B$ test. Under a certain additional assumption (this
is the convexity assumption $\cdots > 0$ of Scheffé's Theorem 2) it is shown that of all
tests satisfying (a) and (b), $w_0$ has maximum power against all alternatives,
i.e. is of type $B_1$.

If now we want to maximize the power against only the one-sided set of alternatives
$\theta > \theta_0$, we determine that test $w_1$, of all those satisfying (a), for which

(d) the first derivative at $\theta_0$, $\beta'_w(\theta_0)$, is as large as possible.

Under a certain additional assumption (in Scheffé's notation this would be the
monotonicity assumption $\cdots > 0$) it can then be shown that of all tests satisfying
(a), $w_1$ has maximum power against all alternatives $\theta > \theta_0$ (it also has
minimum power against all alternatives $\theta < \theta_0$), i.e. $w_1$ is uniformly most powerful
against alternatives $\theta > \theta_0$. We shall not carry through the discussion
in detail since Scheffé's argument applies step by step, with only the obvious
changes.

2.2 Determination of the boundaries. Let $X_1, \ldots, X_n$ be $n$ random variables
with a joint probability density function $p$, depending on parameters $\theta_1$ and
$\theta = (\theta_2, \ldots, \theta_l)$. We shall denote the probability density function of a set of random
variables $X_1, \ldots, X_n$ whose distribution depends on a parameter $\theta$ by
$p(x_1, \ldots, x_n \mid \theta)$ or simply by $p(x_1, \ldots, x_n)$ when the dependence on $\theta$ is
clear from the context. The set of points $(x_1, \ldots, x_n)$ for which
$p(x_1, \ldots, x_n \mid \theta)$ is positive we shall denote by $W_+(\theta)$.
Let

(2.1) $\varphi_i(x_1, \ldots, x_n) = \dfrac{\partial}{\partial\theta_i} \log p(x_1, \ldots, x_n \mid \theta_1, \theta)\Big|_{\theta_1 = \theta_1^0}, \qquad (i = 1, \ldots, l),$

and let the random variable $\Phi_i$ be defined by

(2.2) $\Phi_i = \varphi_i(X_1, \ldots, X_n)$.

Then for testing the hypothesis $H: \theta_1 = \theta_1^0$, under the assumptions stated by
Scheffé, the type $B_1$ test $w_0$ is defined by the inequalities

(2.3) $\varphi_1 < k_1, \; > k_2 \qquad (k_1 < k_2)$

where $k_1$, $k_2$ depend on $\theta_1^0$, $\theta$, $\varphi_2, \ldots, \varphi_l$ and are determined by the two equations²

(2.4) $\displaystyle\int_{k_1}^{k_2} \varphi_1^s\, p(\varphi_1, \ldots, \varphi_l)\, d\varphi_1 = (1 - \epsilon) \int_{-\infty}^{\infty} \text{same} \qquad (s = 0, 1).$

² Although $k_1$ and $k_2$ may depend on $\theta$, $w_0$ is independent of $\theta$, as was shown in [4].

The equations (2.3) and (2.4) are not suitable for the determination of the
boundary of $w_0$. The variables have to be transformed so as to obtain for
$w_0$ an expression from which the calculation of the boundaries becomes feasible
(cf. [9]). This part of the work may be formalized in the following theorem.

Theorem 1. Let

(2.5) $U = f(\Phi_1, \Phi_2, \ldots, \Phi_l), \qquad V_i = g_i(\Phi_2, \ldots, \Phi_l), \quad (i = 2, \ldots, l),$

be a system of functions, continuously differentiable and with non-vanishing Jacobian
almost everywhere, and such that

(i) $U$ is a linear function of $\Phi_1$,

(2.6) $U = a\Phi_1 + b$

with coefficients which may depend on $\Phi_2, \ldots, \Phi_l$ and such that³ $a(\Phi_2, \ldots, \Phi_l) > 0$;

(ii) it is possible to solve for $\Phi_2, \ldots, \Phi_l$ in terms of the $V$'s;

(iii) under the hypothesis $H$, $U$ is distributed independently of $V = (V_2, \ldots, V_l)$.

Then the region $w_0$ is equivalent to the region

(2.7) $u < c_1, \; > c_2 \qquad (c_1 < c_2)$

where $c_1$, $c_2$ are determined by

(2.8) $\displaystyle\int_{c_1}^{c_2} u^s p(u)\, du = (1 - \epsilon) \int_{-\infty}^{\infty} u^s p(u)\, du \qquad (s = 0, 1).$

³ A similar theorem holds when we assume $a(\Phi_2, \ldots, \Phi_l) < 0$.

Proof.

(2.9) $p(\varphi_1, \varphi_2, \ldots, \varphi_l) = p(u, v_2, \ldots, v_l) \left| \dfrac{\partial(u, v_2, \ldots, v_l)}{\partial(\varphi_1, \ldots, \varphi_l)} \right| = p(u)\, p(v_2, \ldots, v_l)\; a(\varphi_2, \ldots, \varphi_l) \left| \dfrac{\partial(v_2, \ldots, v_l)}{\partial(\varphi_2, \ldots, \varphi_l)} \right|.$

But

(2.10) $u = a(\varphi_2, \ldots, \varphi_l) \cdot \varphi_1 + b(\varphi_2, \ldots, \varphi_l) = \alpha(v_2, \ldots, v_l) \cdot \varphi_1 + \beta(v_2, \ldots, v_l)$

so that (2.4) reduces to

(2.11) $\displaystyle\int_{\alpha k_1 + \beta}^{\alpha k_2 + \beta} \left( \frac{u - \beta}{\alpha} \right)^s p(u)\, p(v_2, \ldots, v_l)\, du = (1 - \epsilon) \int_{-\infty}^{\infty} \text{same} \qquad (s = 0, 1)$

and hence, writing $c_i = \alpha k_i + \beta$, to

(2.12) $\displaystyle\int_{c_1(v_2, \ldots, v_l)}^{c_2(v_2, \ldots, v_l)} u^s p(u)\, du = (1 - \epsilon) \int_{-\infty}^{\infty} \text{same} \qquad (s = 0, 1)$

which shows $c_1$ and $c_2$ to be independent of the $v$'s. Also obviously (2.3) transforms
into (2.7), which completes the proof.

If $U$ is such that its distribution (when $\theta_1 = \theta_1^0$) is independent of $\theta$, $c_1$ and $c_2$
of Theorem 1 will depend only on the data of the problem: $\epsilon$, $n$, $\theta_1^0$. However, the
existence of constants $c_1$ and $c_2$ satisfying (2.8) still has to be proved. We may
show more generally the existence of $k_1$ and $k_2$ satisfying (2.4). A proof is immediately
supplied by an argument which was used by Neyman [10] and Wald
[11] to prove the existence of type $A$ tests, and which may be stated in the
following

Lemma. Let $0 < \alpha < 1$, let $f(x) > 0$ and $\int_{-\infty}^{\infty} x^s f(x)\, dx < \infty$ for $s = 0, 1$. Then
there exist $A$, $B$ such that

(2.13) $\displaystyle\int_A^B x^s f(x)\, dx = \alpha \int_{-\infty}^{\infty} x^s f(x)\, dx \qquad (s = 0, 1).$
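The Lemma can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes the particular density $f(x) = e^{-x}$ on $x > 0$ (chosen here only because its truncated moments have closed forms), and the function names are hypothetical. It finds $A$, $B$ satisfying (2.13) by solving the $s = 0$ condition for $B$ and bisecting on $A$ for the $s = 1$ condition.

```python
import math


def lemma_limits_exponential(alpha):
    """Return (A, B) with int_A^B x^s e^{-x} dx = alpha * int_0^inf x^s e^{-x} dx, s = 0, 1."""

    def upper_limit(a):
        # s = 0 condition, e^{-A} - e^{-B} = alpha, solved for B.
        return -math.log(math.exp(-a) - alpha)

    def first_moment_gap(a):
        # s = 1 condition: int_A^B x e^{-x} dx minus alpha times the full first moment (= 1).
        b = upper_limit(a)
        return (a + 1.0) * math.exp(-a) - (b + 1.0) * math.exp(-b) - alpha

    # The gap is negative at A = 0 and positive as A approaches -log(alpha),
    # so bisection on A locates a root.
    lo, hi = 0.0, -math.log(alpha) - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if first_moment_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return a, upper_limit(a)


def truncated_moments(a, b):
    """Closed forms of int_a^b e^{-x} dx and int_a^b x e^{-x} dx."""
    mass = math.exp(-a) - math.exp(-b)
    first = (a + 1.0) * math.exp(-a) - (b + 1.0) * math.exp(-b)
    return mass, first


A, B = lemma_limits_exponential(0.95)
mass, first = truncated_moments(A, B)
print(A, B, mass, first)
```

With $\alpha = 1 - \epsilon$ the pair $(A, B)$ plays the role of $(c_1, c_2)$ in (2.8) for this assumed density.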


3. Testing for circular serial correlation in a normal population. We now
apply the results of the previous section to obtain the optimum tests (i.e. uniformly
most powerful against the one-sided set of alternatives, type $B_1$ in the
two-sided case) for the hypothesis specifying the value of the circular serial correlation
coefficient in the normal population considered by Dixon [12]. (For
the literature on testing for non-circular serial correlation in normal populations
cf. [12].)

We assume

(3.1) $p(x_1, \ldots, x_n) = \dfrac{1 - \delta^n}{(\sqrt{2\pi}\,\sigma)^n} \exp\left[ -\dfrac{1}{2\sigma^2} \sum_{i=1}^n \big( (x_i - \xi) - \delta (x_{i+1} - \xi) \big)^2 \right]$

where $x_{n+1} = x_1$ and $|\delta| < 1$, and we test the hypothesis $\delta = \delta_0$. For testing
purposes only the value $\delta_0 = 0$ is of interest presumably; however, the family of
tests for arbitrary $\delta_0$ is required for estimating $\delta$ by means of confidence intervals,
and therefore the more general hypothesis is considered.

Making a transformation in one of the parameters we write

(3.2) $p(x_1, \ldots, x_n) = C(\delta, \alpha) \exp\left[ \alpha \left[ (1 + \delta^2) \sum (x_i - \xi)^2 - 2\delta \sum (x_i - \xi)(x_{i+1} - \xi) \right] \right]$

where in the notation of the previous section $\theta_1 = \delta$, $\theta_2 = \alpha$, $\theta_3 = \xi$.

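Theorem 2. For testing the hypothesis $\delta = \delta_0$ for the distribution (3.2)

(a) the type $B_1$ test exists and is given by

(3.3) $r < r_1, \; > r_2$

where

(3.4) $r = \dfrac{\sum_{i=1}^n (x_i - \bar{x})(x_{i+1} - \bar{x})}{\sum_{i=1}^n (x_i - \bar{x})^2}, \qquad \bar{x} = \dfrac{1}{n} \sum_{i=1}^n x_i,$

and where $r_1$ and $r_2$ are determined by

(3.5) $\displaystyle\int_{r_1}^{r_2} \left( \frac{r}{1 + \delta_0^2 - 2\delta_0 r} \right)^s p(r)\, dr = (1 - \epsilon) \int_{-\infty}^{\infty} \text{same} \qquad (s = 0, 1);$

(b) the uniformly most powerful similar region for testing $H$ against the alternatives
$\delta > \delta_0$ exists and is given by

(3.6) $r > r'$

where $r'$ is determined by

(3.7) $\displaystyle\int_{-\infty}^{r'} p(r)\, dr = (1 - \epsilon) \int_{-\infty}^{\infty} p(r)\, dr.$⁴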
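As a concrete illustration of the statistic in (3.4), the following sketch computes the circular serial correlation coefficient $r$ and the quantity $u = r/(1 + \delta_0^2 - 2\delta_0 r)$ that appears in (3.5). The function names are hypothetical and the sample is made up; the critical values $r_1$, $r_2$, $r'$ are not computed, since they depend on the distribution of $r$.

```python
def circular_serial_correlation(x):
    """Circular serial correlation coefficient r of (3.4), with x_{n+1} = x_1."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    # The index (i + 1) % n wraps around, pairing x_n with x_1.
    numerator = sum(dev[i] * dev[(i + 1) % n] for i in range(n))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


def u_statistic(r, delta0):
    """The increasing transform u = r / (1 + delta0^2 - 2 delta0 r) used in (3.5)."""
    return r / (1.0 + delta0 ** 2 - 2.0 * delta0 * r)


r = circular_serial_correlation([1.0, 2.0, 3.0, 4.0])
print(r, u_statistic(r, 0.0), u_statistic(r, 0.3))
```

For $\delta_0 = 0$ the two statistics coincide, which is the trivial case of the equivalence shown in the proof.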


Proof. We compute

(3.8)
$\varphi_1 = c_1(\delta_0, \alpha) + 2\alpha \left[ \delta_0 \sum (x_i - \xi)^2 - \sum (x_i - \xi)(x_{i+1} - \xi) \right]$
$\varphi_2 = c_2(\delta_0, \alpha) + (1 + \delta_0^2) \sum (x_i - \xi)^2 - 2\delta_0 \sum (x_i - \xi)(x_{i+1} - \xi)$
$\varphi_3 = -2n\alpha (1 - \delta_0)^2 (\bar{x} - \xi).$

There is no difficulty in checking the conditions of Scheffé's theorems [4].

Next we apply Theorem 1 of the previous section, and define

(3.9)
$V_2 = (1 + \delta_0^2) \sum (X_i - \bar{X})^2 - 2\delta_0 \sum (X_i - \bar{X})(X_{i+1} - \bar{X})$
$V_3 = \bar{X} - \xi$
$U = \dfrac{\sum (X_i - \bar{X})(X_{i+1} - \bar{X})}{V_2}.$

Conditions (i) and (ii) of Theorem 1 are easily seen to be satisfied. To show that
$U$ is independent of $V = (V_2, V_3)$ we employ arguments which have recently
been used by various authors in a number of similar problems (cf. [13, 14, 15]).

It is seen that an orthonormal transformation exists:

$X_1, \ldots, X_n \to Y_1, \ldots, Y_n$

such that

(3.10)
$\sqrt{n}\, \bar{X} = Y_1$
$\sum_{i=1}^n (X_i - \bar{X})(X_{i+1} - \bar{X}) = \sum_{i=2}^n \lambda_i Y_i^2$
$\sum_{i=1}^n (X_i - \bar{X})^2 = \sum_{i=2}^n Y_i^2.$

⁴ A corresponding result holds for the other one-sided case.

Under $H$ the $Y$'s are distributed with probability density

(3.11) $p(y_1, \ldots, y_n) = C(\delta_0, \alpha) \exp\left[ \alpha \left( \kappa (y_1 - \sqrt{n}\, \xi)^2 + \sum_{i=2}^n \mu_i y_i^2 \right) \right]$

where $\kappa, \mu_2, \ldots, \mu_n$ depend on $\delta_0$ and where the $\mu$'s are all positive. Introducing
new variables

(3.12) $Z_i = \sqrt{\mu_i}\, Y_i, \qquad (i = 2, \ldots, n),$

and, then, generalized polar coordinates in the space of the $Z$'s,

(3.13) $R = \sqrt{\sum_{i=2}^n Z_i^2}, \quad \Psi_1, \ldots, \Psi_{n-2},$

we see that $Y_1$, $R$ and $\Psi_1, \ldots, \Psi_{n-2}$ are completely independent. Also
$V_2 = R^2$ and $V_3 = Y_1/\sqrt{n} - \xi$,
while $U$, being homogeneous of degree 0 in the $Z$'s, is a function of the $\Psi$'s only.
This proves that $U$, $V_2$ and $V_3$ are completely independent. The type $B_1$ test
of $H$ is therefore given by

(3.14) $U = \dfrac{\sum_{i=1}^n (x_i - \bar{x})(x_{i+1} - \bar{x})}{(1 + \delta_0^2) \sum_{i=1}^n (x_i - \bar{x})^2 - 2\delta_0 \sum_{i=1}^n (x_i - \bar{x})(x_{i+1} - \bar{x})} < c_1, \; > c_2$

where $c_1$ and $c_2$ are determined by

(3.15) $\displaystyle\int_{c_1}^{c_2} u^s p(u)\, du = (1 - \epsilon) \int_{-\infty}^{\infty} u^s p(u)\, du \qquad (s = 0, 1).$

We still have to show that this test is equivalent to the one defined by (3.3)
and (3.5). For $\delta_0 = 0$ this is trivial. Let us assume $\delta_0 < 0$. (The other
case goes through similarly.) The inequality $u < c_1$ is equivalent to

(3.16) $(1 + 2\delta_0 c_1) \sum (x_i - \bar{x})(x_{i+1} - \bar{x}) < c_1 (1 + \delta_0^2) \sum (x_i - \bar{x})^2$

and hence to

(3.17) $\dfrac{\sum (x_i - \bar{x})(x_{i+1} - \bar{x})}{\sum (x_i - \bar{x})^2} < \dfrac{c_1 (1 + \delta_0^2)}{1 + 2 c_1 \delta_0}$

provided $1 + 2c_1\delta_0 > 0$. Suppose $1 + 2c_1\delta_0 \le 0$, i.e. $c_1 \ge -\dfrac{1}{2\delta_0}$. Then⁵

(3.18) $P\{U < c_1\} \ge P\left\{ U < -\dfrac{1}{2\delta_0} \right\} = P\left\{ 0 < \sum (X_i - \bar{X})^2 \right\} = 1$

⁵ We denote the probability of an event $A$ by $P\{A\}$.

i.e. $P\{U < c_1\} = 1$, which would contradict (3.15). Similarly if $1 + 2c_2\delta_0 \le 0$
we would have $P\{U > c_2\} = 0$ and hence our test would be one-sided and therefore
not unbiased. The inequalities $u < c_1$, $> c_2$ are thus equivalent to the
inequalities $r < r_1$, $> r_2$ and since

$u = \dfrac{r}{1 + \delta_0^2 - 2\delta_0 r},$

(3.5) also follows.

The existence of type $B_1$ and uniformly most powerful one-sided tests of the
hypothesis $H$ is rather surprising. For when $\alpha$ and $\xi$ are assumed known, neither
the type $A_1$ test nor the uniformly most powerful one-sided test of the simple
hypothesis $H'$: $\delta = \delta_0$ exists. This is easily seen by determining the most
powerful and the most powerful unbiased test against a specific alternative
for the hypothesis $H'$ in the population

(3.19) $p(x_1, \ldots, x_n) = \dfrac{1 - \delta^n}{(2\pi)^{n/2}} \exp\left[ -\tfrac{1}{2} \left[ (1 + \delta^2) \sum x_i^2 - 2\delta \sum x_i x_{i+1} \right] \right].$

The distribution of the criterion $R$ was obtained by R. L. Anderson [16] (see
also [17]) for the case $\delta = 0$. Madow [15], using Anderson's result, found the distribution
for arbitrary $\delta$. (Approximations to the distribution have been studied
by various authors; for the literature on this cf. [18]. Recently Hsu [19] obtained
an asymptotic expansion.) A direct derivation for arbitrary $\delta$ may be
based on the following theorem of Cramér, which was communicated to the
author by Dr. P. L. Hsu.

Theorem 3 (Cramér)⁶. If $X$, $Y$ are two random variables (not necessarily
independent), $Y > 0$, then


where $\varphi_x$ and $\psi$ are the characteristic functions of $X - xY$ and $Y$ respectively,
provided

(3.21) $\displaystyle\int_{-\infty}^{\infty} \left| \frac{\varphi_x(t) - \psi(t)}{t} \right| dt < \infty.$


Theorem 4, If 


(3.22) 




(Xn+, - Xl) 


⁶ Differentiated forms of the theorem were given by R. C. Geary [Jour. Roy. Stat. Soc.,
Vol. 107 (1944), p. 66] and H. Cramér [Exercise 6 on p. 317 of Mathematical Methods of
Statistics, Princeton Univ. Press (1946)].

and if

(3.23) $R = \dfrac{\sum_{i=1}^n (X_i - \bar{X})(X_{i+1} - \bar{X})}{\sum_{i=1}^n (X_i - \bar{X})^2}$

then
PIR > r} 

(3 24) 


2^+112 1 — S " , 

n (1 - 5)(1 + 52 _ 2dr) 



n n 

1 + 5= - 25\f 


where the summation is extended over all integers $j$, $1 \le j < \dfrac{n}{2}$,
for which $\lambda_j > r$, and where

(3.25) $\lambda_j = \cos \dfrac{2\pi j}{n}.$

The proof of this theorem from Theorem 3 is straightforward and will only
be indicated here. If $X$ and $Y$ denote the numerator and denominator of $R$
respectively, the characteristic functions of $Y$ and $X - rY$ may be obtained by
the method of circulants (cf. [12, 17]). The integral on the right hand side of
(3.20) is then easily evaluated by the theory of residues when $n$ is odd. In the
case that $n$ is even, the integrand has two branch points, one in the lower and one
in the upper half plane. These may be separated, and then again the method
of residues may be applied.
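The method of circulants rests on the fact that the circular quadratic form $\sum x_i x_{i+1}$ has the symmetric circulant matrix $\tfrac12(P + P^{T})$, with $P$ the cyclic shift, whose eigenvalues are $\cos(2\pi j/n)$. The sketch below is only a numerical check of that standard fact for one value of $n$; the function names are hypothetical and nothing in it reproduces the residue calculation itself.

```python
import math


def circular_lag_average(v):
    """Apply the symmetric circulant operator (A v)_k = (v_{k-1} + v_{k+1}) / 2, indices mod n."""
    n = len(v)
    return [0.5 * (v[(k - 1) % n] + v[(k + 1) % n]) for k in range(n)]


def max_eigen_residual(n, j):
    """Largest entry of |A v - lambda_j v| for v_k = cos(2 pi j k / n), lambda_j = cos(2 pi j / n)."""
    lam = math.cos(2.0 * math.pi * j / n)
    v = [math.cos(2.0 * math.pi * j * k / n) for k in range(n)]
    av = circular_lag_average(v)
    return max(abs(av[k] - lam * v[k]) for k in range(n))


# For n = 7 the indices j = 1, 2, 3 give the distinct eigenvalues other than 1.
residuals = [max_eigen_residual(7, j) for j in range(1, 4)]
print(residuals)
```

The residuals vanish up to rounding because $\tfrac12[\cos\theta(k-1) + \cos\theta(k+1)] = \cos\theta\,\cos\theta k$.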


4. Similar regions. The problem of finding all regions similar to the sample
space with respect to a parameter $\theta$ was solved by Neyman and Pearson [1] for
a certain class of probability laws. In a later paper Neyman proved ([20],
Proposition IX) that if there exists a sufficient statistic $T$ for a parameter $\theta$,
then $w$ is similar with respect to $\theta$ if it has the following structure: for the intersection
$w(t)$ of $w$ with the surface $T = t$, the relative probability of $w(t)$ given
$T = t$ has a constant value independent of $t$. We shall show in this section that
for a large class of probability laws which admit a sufficient statistic for $\theta$ the
regions with the above structure are the only ones that are similar with respect
to $\theta$.

We consider samples from a univariate distribution and we distinguish three
cases as one, both or neither of the extremes of the range of the distribution
depend on the parameter $\theta$. For the first of these cases (cf. Pitman [21]) we consider
samples from a distribution with probability density

(4.1) $p(x) = \dfrac{f(x)}{b(\theta)}, \qquad k(\theta) \le x \le c,$

where $k(\theta)$ is a strictly monotone continuous function of $\theta$ and where $c$ may be
infinite. Introducing a new parameter $\delta = k(\theta)$ the distribution of a sample
from (4.1) is given by

(4.2) $p(x_1, \ldots, x_n) = \prod_{i=1}^n \dfrac{f(x_i)}{b(\delta)}, \qquad \delta \le x_i \le c.$

To obtain the totality of regions $w$ similar with respect to $\delta$ let us denote by
$W_1, \ldots, W_n$ the portions of the sample space where the smallest of the $x$'s is
$x_1, \ldots, x_n$ respectively. For any region $w$ denote by $w_k$ the intersection of $w$
with $W_k$. Consider a transformation carrying $W_1, \ldots, W_n$ into $W_1$, letting
$y_1 = \min(x_1, \ldots, x_n)$ and letting in $W_k$:

(4.3) $y_1 = x_k, \; y_2 = x_1, \; \ldots, \; y_k = x_{k-1}, \; y_{k+1} = x_{k+1}, \; \ldots, \; y_n = x_n.$

Denote by $w_k^*$ the image of $w_k$ under this transformation. The condition that
$w$ be similar with respect to $\delta$,

(4.4) $\displaystyle\int_w \frac{f(x_1)}{b(\delta)} \cdots \frac{f(x_n)}{b(\delta)}\, dx_1 \cdots dx_n = \epsilon,$

may be written in the form

(4.5) $\displaystyle\int_\delta^c \frac{f(y_1)}{b(\delta)} \left\{ \sum_{k=1}^n \int_{w_k^*(y_1)} \frac{f(y_2)}{b(\delta)} \cdots \frac{f(y_n)}{b(\delta)}\, dy_2 \cdots dy_n \right\} dy_1 = n\epsilon \int_\delta^c \frac{f(y_1)}{b(\delta)} \left\{ \int_{W(y_1)} \frac{f(y_2)}{b(\delta)} \cdots \frac{f(y_n)}{b(\delta)}\, dy_2 \cdots dy_n \right\} dy_1$

where $W(y_1)$ denotes the region $y_1 \le y_i \le c$, $(i = 2, \ldots, n)$, that is, the region
of variation of $y_2, \ldots, y_n$ given $y_1$, and where $w_k^*(y_1)$ denotes the region of variation
of $y_2, \ldots, y_n$ given $y_1$ and $w_k^*$. From (4.5) we obtain

(4.6) $\dfrac{1}{[b(\delta)]^n} \displaystyle\int_\delta^c f(y_1)\, \psi(y_1)\, dy_1 = 0$

where

(4.7) $\psi(y_1) = \displaystyle\sum_{k=1}^n \int_{w_k^*(y_1)} f(y_2) \cdots f(y_n)\, dy_2 \cdots dy_n - n\epsilon \int_{W(y_1)} f(y_2) \cdots f(y_n)\, dy_2 \cdots dy_n.$

But (4.6), holding for every $\delta$, implies

(4.8) $\psi(y_1) = 0$ almost everywhere

and since we can only determine $w$ up to a set of measure 0, we may omit the
qualification in (4.8). Therefore a necessary and sufficient condition for $w$ to
be similar is

(4.9) $\dfrac{\displaystyle\sum_{k=1}^n \int_{w_k^*(y_1)} f(y_2) \cdots f(y_n)\, dy_2 \cdots dy_n}{n \displaystyle\int_{W(y_1)} f(y_2) \cdots f(y_n)\, dy_2 \cdots dy_n} = \epsilon$

for all $y_1$.

To see more clearly the structure of these regions, let us take $n = 2$. Equation
(4.9) states that on each of the broken lines of Fig. 1 the relative probability
of $w = w_1 + w_2$ given $Y_1 = y_1$ is $\epsilon$, where the decomposition of this probability
into its two components may vary with $y_1$.

[Fig. 1: the broken line $Y_1 = y_1$]

In general equation (4.9) states that on each hyperplane $Y_1 = y_1$ the relative
probability of $w$ is independent of $y_1$. Since $Y_1 = \min(X_1, \ldots, X_n)$ is a sufficient
statistic for $\delta$, Neyman's theorem in this case does give all similar regions.
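The structure described by (4.9) can be seen in a small simulation. This is a sketch under assumptions that are not in the text: $n = 2$, the particular choice $f(x) = e^{-x}$ with $c = \infty$ (so the sample is a shifted unit exponential), and the hypothetical region $w = \{|x_1 - x_2| > k\}$. Given $Y_1 = y_1$ the other observation exceeds $y_1$ by a unit exponential amount, so the conditional probability of $w$ is $e^{-k}$ for every $y_1$, and the region should have size $e^{-k}$ whatever the value of $\delta$.

```python
import math
import random


def rejection_rate(delta, k, trials, seed):
    """Monte Carlo estimate of P(|X1 - X2| > k) for X_i = delta + unit exponential."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = delta + rng.expovariate(1.0)
        x2 = delta + rng.expovariate(1.0)
        if abs(x1 - x2) > k:
            hits += 1
    return hits / trials


epsilon = 0.05
k = -math.log(epsilon)  # chosen so that exp(-k) = epsilon
# Two different values of the nuisance parameter, independent random streams.
rates = [rejection_rate(0.0, k, 100_000, seed=1),
         rejection_rate(5.0, k, 100_000, seed=2)]
print(rates)
```

Both estimates should lie close to $\epsilon$, which is the sense in which the region is similar with respect to $\delta$.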

Next let us consider the case where both extremes of the range of the distribution
depend on the parameter. We shall assume (cf. [21]) that $X_1, \ldots, X_n$
are distributed with probability density

(4.10) $p(x) = \dfrac{f(x)}{g(\theta)}$ in $\theta \le x \le b(\theta)$

where $b$ is a strictly decreasing continuous function over an interval $[-\infty, b(-\infty)]$
and where $b[b(-\infty)] = -\infty$. These assumptions insure that there
exists a unique number $a$, $-\infty < a < b(-\infty)$, such that $b(a) = a$.

Denote by $W_{ij}$, $(i, j = 1, \ldots, n;\; i \ne j)$, the portion of the sample space
where the smallest and the largest of the $x$'s are $x_i$ and $x_j$ respectively. Denote
by $W_{ij1}$ and $W_{ij2}$ those portions of $W_{ij}$ where $x_i$ is greater than and less than
$b^{-1}(x_j)$ respectively. For any region $w$ denote by $w_{ijk}$ the intersection of $w$ with
$W_{ijk}$. Consider a transformation carrying the sample space into $W_{1n}$, letting
$y_1 = \min(x_1, \ldots, x_n)$, $y_n = \max(x_1, \ldots, x_n)$ and in $W_{ij}$ letting $y_2, \ldots, y_{n-1}$
denote the remaining $x$'s in the order of their subscripts. Next make a transformation
carrying $W_{1n}$ into $W_{1n1}$, letting $z_1 = \max[y_1, b^{-1}(y_n)]$,
$z_n = \min[y_1, b^{-1}(y_n)]$ and $z_k = y_k$ for $k = 2, \ldots, n - 1$. Denote by $w_{ijk}^*$ the image of
$w_{ijk}$ in $W_{1n1}$.

Then $Z_n$ is a sufficient statistic for $\theta$ (cf. [21]) and there exist functions $f_1$, $g_1$
such that the density of $Z_n$ is given by

(4.11) $p(z_n) = \dfrac{f_1(z_n)}{g_1(\theta)}$ in $\theta \le z_n \le a$

while the distribution of the remaining $Z$'s given $Z_n$ is independent of $\theta$.

The condition that $w$ be a similar region may now be written, analogously to
(4.5), in the form

(4.12) $\displaystyle\int_\theta^a \frac{f_1(z_n)}{g_1(\theta)} \left\{ \sum_{i,j,k} \int_{w_{ijk}^*(z_n)} p(z_1, \ldots, z_{n-1} \mid z_n)\, dz_1 \cdots dz_{n-1} \right\} dz_n = \epsilon \int_\theta^a \frac{f_1(z_n)}{g_1(\theta)}\, dz_n$

and hence by the argument which led to (4.6), as

(4.13) $\displaystyle\sum_{i,j,k} \int_{w_{ijk}^*(z_n)} p(z_1, \ldots, z_{n-1} \mid z_n)\, dz_1 \cdots dz_{n-1} = \epsilon$ for all $z_n$.

Thus in this case also Neyman's theorem gives the most general similar region.

For the case that neither extreme of the range of the distribution depends on
the parameter $\theta$, it has been shown by various authors [22, 21, 23], under slightly
varying assumptions concerning the regularity of the distribution function, that
the existence of a sufficient statistic implies

(4.14) $p(x \mid \theta) = \exp[P(\theta) + T(x)Q(\theta) + R(x)].$

This (cf. [10]) is a special case of that for which Neyman and Pearson determined
the totality of similar regions, however under the restriction that the moments
of $\Phi = \dfrac{\partial}{\partial\theta} \log p(X)$ uniquely determine the distribution of $\Phi$. We shall
briefly indicate how this assumption may be avoided.

Let $X_1, \ldots, X_n$ be a sample from (4.14), or, more generally (this is the case
considered by Neyman and Pearson), let $X_1, \ldots, X_n$ be distributed with probability
density

(4.15) $p(x_1, \ldots, x_n) = \exp[P(\theta) + u(x_1, \ldots, x_n) Q(\theta) + v(x_1, \ldots, x_n)]$

in a sample space $W_+$ which is independent of $\theta$. We shall assume that the set
of values which $Q$ takes on contains at least some interval. Introducing
$\delta = -Q(\theta)$ as a new parameter, we shall obtain all regions similar with respect to $\delta$ (where the
set of values of $\delta$ contains an interval) for the distribution⁷

(4.16) $p(x_1, \ldots, x_n) = \exp[P_1(\delta) - \delta \cdot u(x_1, \ldots, x_n) + v(x_1, \ldots, x_n)]$


under the assumption that 



5^ 0 except possibly on a set of measure 0. 


Let us for a moment assume that there exist functions $f_i(x_1, \ldots, x_n)$,
$(i = 2, \ldots, n)$, with continuous partial derivatives almost everywhere and such
that the transformation

(4.17) $y_1 = u(x_1, \ldots, x_n); \qquad y_i = f_i(x_1, \ldots, x_n), \quad (i = 2, \ldots, n),$

is one to one on $W_+$ except possibly on a subset of measure 0. Applying this
transformation we may write the condition of similarity in the form

(4.18) $\displaystyle\int_{-\infty}^{\infty} e^{P_1(\delta) - \delta y_1} \int_{w(y_1)} f(y_1, \ldots, y_n)\, dy_2 \cdots dy_n\, dy_1 = \epsilon \int_{-\infty}^{\infty} e^{P_1(\delta) - \delta y_1} \int_{W(y_1)} f(y_1, \ldots, y_n)\, dy_2 \cdots dy_n\, dy_1$

where $W(y_1)$ denotes the region of variation of $y_2, \ldots, y_n$ given $y_1$, and where
$w(y_1)$ denotes the region of variation of $y_2, \ldots, y_n$ given $y_1$ and $w$. Furthermore
$f(y_1, \ldots, y_n)$ is independent of $\delta$. From the theory of bilateral Laplace transforms
it is known that (4.18) implies that

(4.19) $\displaystyle\int_{w(y_1)} f(y_1, \ldots, y_n)\, dy_2 \cdots dy_n = \epsilon \int_{W(y_1)} f(y_1, \ldots, y_n)\, dy_2 \cdots dy_n$

which is the desired result.

More generally it may be shown that our assumption concerning $u(x_1, \ldots, x_n)$
insures the existence of functions $f_i$, $(i = 2, \ldots, n)$, such that under the transformation
(4.17) no point $(y_1, \ldots, y_n)$ has more than a denumerable infinity of
counter images in $x$-space. Our proof can be modified to cover this case. The
argument is similar to that used to obtain equations (4.9) and (4.13), which were
also arrived at through many to one transformations.


5. Testing exponential and rectangular distributions. In their fundamental
1928 paper [24] on likelihood ratio tests, Neyman and Pearson discussed various
hypotheses relating to normal, exponential and rectangular distributions. Later
they and other authors developed a theory of similar and bisimilar regions which
made it possible to obtain optimum tests of many composite hypotheses with
one constraint concerning normal populations. This theory however is not
applicable to most hypotheses concerning exponential or rectangular distributions.
We shall in this section obtain optimum tests of some hypotheses relating
to those latter distributions, using the method of the previous section.

⁷ An assumption that we can solve for $\theta$ as a function of $\delta$ is not needed since we can
determine $P_1(\delta)$ by integrating the density (4.16) over $W_+$.

Let us first consider a sample $X_1, \ldots, X_n$ from an exponential population,
the probability density of the sample being

(5.1) $p(x_1, \ldots, x_n) = \dfrac{1}{a^n} \exp\left[ -\dfrac{1}{a} \sum_{i=1}^n (x_i - b) \right], \qquad x_i \ge b \quad (i = 1, \ldots, n),$

and let us consider the two hypotheses $H_1$: $a = a_0$, $H_2$: $b = b_0$ where, without loss
of generality, we shall take $a_0 = 1$, $b_0 = 0$. The likelihood ratio tests of both
these hypotheses were shown to be completely unbiased by Paulson [25]. We
shall prove

Theorem 5. The likelihood ratio tests of $H_1$ and $H_2$ are type $B_1$ and uniformly
most powerful, respectively. The one-sided tests based on the likelihood ratio criterion
for $H_1$ are the uniformly most powerful one-sided similar regions for testing this
hypothesis.

Proof. In order to simplify the argument we shall give a detailed proof only
for the restricted class of tests which are symmetric in the variables $X_1, \ldots, X_n$.

For testing $H_1$ let us make the following transformation introduced by
Sukhatme [26]:

(5.2) $Z_1 = nY_1, \qquad Z_i = (n - i + 1)(Y_i - Y_{i-1}) \quad (i = 2, \ldots, n),$

where $Y_i$ is the $i$th of the $X$'s in order of magnitude. Then

(5.3) $p(z_1, \ldots, z_n) = \dfrac{1}{a^n} \exp\left[ -\dfrac{1}{a} \left( \sum_{i=1}^n z_i - nb \right) \right]$ if $z_1 \ge nb$; $z_i \ge 0$ $(i = 2, \ldots, n)$.
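The transformation (5.2) is easy to check on a small sample. The sketch below (function name and data are illustrative only) computes $Z_1, \ldots, Z_n$ from the ordered observations and verifies the two identities used in the sequel: $\sum_{i=1}^n Z_i = \sum_{i=1}^n X_i$ and $\sum_{i=2}^n Z_i = \sum_{i=1}^n [X_i - \min(X_1, \ldots, X_n)]$.

```python
def sukhatme_transform(x):
    """Return [Z_1, ..., Z_n] of (5.2): Z_1 = n Y_1, Z_i = (n - i + 1)(Y_i - Y_{i-1})."""
    n = len(x)
    y = sorted(x)  # Y_1 <= ... <= Y_n, the order statistics
    z = [n * y[0]]
    for i in range(2, n + 1):
        # y[i - 1] is Y_i and y[i - 2] is Y_{i-1} in zero-based indexing.
        z.append((n - i + 1) * (y[i - 1] - y[i - 2]))
    return z


x = [2.5, 0.7, 1.9, 4.2]
z = sukhatme_transform(x)
excess = sum(xi - min(x) for xi in x)
print(z, sum(z), sum(z[1:]), excess)
```

Every $Z_i$ with $i \ge 2$ is a non-negative spacing, which is why the sample space in (5.3) is $z_1 \ge nb$, $z_i \ge 0$.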


We want to determine all regions $w$ which under $H_1$ are similar to the sample
space with respect to $b$, i.e. all regions $w$ satisfying

(5.4) $\displaystyle\int_{nb}^{\infty} e^{-(z_1 - nb)} \int_{w(z_1)} \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n\, dz_1 = \epsilon = \epsilon \int_{nb}^{\infty} e^{-(z_1 - nb)}\, dz_1$

where $w(z_1)$ denotes the intersection of $w$ with the hyperplane $Z_1 = z_1$. Now
(5.4) is equivalent to

(5.5) $e^{nb} \displaystyle\int_{nb}^{\infty} e^{-z_1} f(z_1)\, dz_1 = 0$

where

(5.6) $f(z_1) = \displaystyle\int_{w(z_1)} \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n - \epsilon$

and this in turn is equivalent to

(5.7) $f(z_1) = 0$ for all $z_1$.

Of all the regions $w$ satisfying (5.7) we want to determine the one which against
a specific alternative, say $a_1$, has maximum power, i.e. for which

(5.8) $\displaystyle\int_{nb}^{\infty} e^{-(z_1 - nb)/a_1} \int_{w(z_1)} \exp\left[ -\frac{1}{a_1} \sum_{i=2}^n z_i \right] dz_2 \cdots dz_n\, dz_1$

is as large as possible. We thus see that $w$ will have the desired properties if
$w(z_1)$ is determined according to the two conditions

(5.9) $\displaystyle\int_{w(z_1)} \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n = \epsilon$

and

(5.10) $\displaystyle\int_{w(z_1)} \exp\left[ -\frac{1}{a_1} \sum_{i=2}^n z_i \right] dz_2 \cdots dz_n = \max.$

Hence by the Neyman-Pearson fundamental lemma $w(z_1)$ is the set of points
satisfying

(5.11) $\exp\left[ -\dfrac{1}{a_1} \sum_{i=2}^n z_i \right] > C(a_1, z_1) \exp\left[ -\sum_{i=2}^n z_i \right]$

and therefore according as $a_1$ is greater or less than 1, $w(z_1)$ is determined by

(5.12) $\displaystyle\sum_{i=2}^n z_i = \sum_{i=1}^n [x_i - \min(x_1, \ldots, x_n)] > k(a_1, z_1)$, or $\displaystyle\sum_{i=2}^n z_i = \sum_{i=1}^n [x_i - \min(x_1, \ldots, x_n)] < k'(a_1, z_1).$

But $\sum_{i=2}^n Z_i$ is independently distributed of $Z_1$ and under $H_1$ the distribution of
$\sum_{i=2}^n Z_i$ does not depend on $a_1$; in fact $2\sum_{i=2}^n Z_i$ has a chi-square distribution with $2n - 2$
degrees of freedom. Thus $k$ and $k'$, as determined by (5.9), are independent of
$a_1$ and the two tests (5.12) are uniformly most powerful one-sided.
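To make the one-sided test (5.12) concrete, the cutoff $k$ can be computed from (5.9). The sketch below is an illustration rather than the author's procedure: it uses the fact that under $H_1$ (with $a_0 = 1$) the statistic $\sum_{i=2}^n Z_i$ is a sum of $n - 1$ independent unit exponentials, whose upper tail is $e^{-k} \sum_{j=0}^{n-2} k^j / j!$, and solves for $k$ by bisection. The function names are hypothetical.

```python
import math


def gamma_upper_tail(k, shape):
    """P(S > k) for S a sum of `shape` independent unit exponentials (integer shape >= 1)."""
    term, total = 1.0, 1.0
    for j in range(1, shape):
        term *= k / j  # term now equals k^j / j!
        total += term
    return math.exp(-k) * total


def upper_cutoff(n, epsilon):
    """Cutoff k with P(sum_{i>=2} Z_i > k) = epsilon under H_1, for the alternatives a > 1."""
    shape = n - 1
    lo, hi = 0.0, 1.0
    while gamma_upper_tail(hi, shape) > epsilon:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_upper_tail(mid, shape) > epsilon:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def excess_over_minimum(x):
    """The test statistic sum_i [x_i - min(x_1, ..., x_n)] of (5.12)."""
    m = min(x)
    return sum(xi - m for xi in x)


k10 = upper_cutoff(10, 0.05)
print(k10, excess_over_minimum([2.5, 0.7, 1.9, 4.2]))
```

For $n = 2$ the cutoff reduces to $-\log \epsilon$, i.e. half the upper $\epsilon$ point of chi-square with 2 degrees of freedom.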

Next we consider the more restricted class of unbiased similar regions. For $w$
to be unbiased we must have

(5.13) $\dfrac{\partial}{\partial a} \left\{ \displaystyle\int_{nb}^{\infty} \frac{1}{a^n} \exp\left[ -\frac{z_1 - nb}{a} \right] \int_{w(z_1)} \exp\left[ -\frac{1}{a} \sum_{i=2}^n z_i \right] dz_2 \cdots dz_n\, dz_1 \right\}_{a = 1}$
$= \displaystyle\int_{nb}^{\infty} (z_1 - nb - n) \exp[-(z_1 - nb)] \int_{w(z_1)} \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n\, dz_1$
$\quad + \displaystyle\int_{nb}^{\infty} \exp[-(z_1 - nb)] \int_{w(z_1)} \left( \sum_{i=2}^n z_i \right) \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n\, dz_1 = 0.$

The first of the integrals in the middle member equals

(5.14) $\displaystyle\int_0^{\infty} (z - n) e^{-z} \int_{w(z_1)} \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n\, dz = \epsilon \int_0^{\infty} (z - n) e^{-z}\, dz = -(n - 1)\epsilon.$

Therefore

(5.15) $\displaystyle\int_{nb}^{\infty} e^{-(z_1 - nb)} \int_{w(z_1)} \left( \sum_{i=2}^n z_i \right) \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n\, dz_1 = (n - 1)\epsilon = (n - 1)\epsilon \int_{nb}^{\infty} e^{-(z_1 - nb)}\, dz_1$

or

(5.16) $e^{nb} \displaystyle\int_{nb}^{\infty} e^{-z_1} g(z_1)\, dz_1 = 0$

where

(5.17) $g(z_1) = \displaystyle\int_{w(z_1)} \left( \sum_{i=2}^n z_i \right) \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n - (n - 1)\epsilon.$

Thus finally the condition of unbiasedness reduces to

(5.18) $\displaystyle\int_{w(z_1)} \left( \sum_{i=2}^n z_i \right) \exp\left[ -\sum_{i=2}^n z_i \right] dz_2 \cdots dz_n = (n - 1)\epsilon$

and we seek the region $w(z_1)$ which satisfies (5.9), (5.10) and (5.18).

By the fundamental lemma $w(z_1)$ is given by

(5.19) $\exp\left[ -\dfrac{1}{a_1} \sum_{i=2}^n z_i \right] > \left[ C_1(a_1, z_1) \sum_{i=2}^n z_i + C_2(a_1, z_1) \right] \cdot \exp\left[ -\sum_{i=2}^n z_i \right]$

which is equivalent to

(5.20) $\displaystyle\sum_{i=2}^n z_i < k_1(a_1, z_1), \; > k_2(a_1, z_1)$

where $k_1$ and $k_2$ are determined by (5.9) and (5.18), and are therefore independent
of $z_1$ and $a_1$. Thus the region (5.20) which, of all unbiased similar regions,
maximizes the power against the alternative $a = a_1$ is independent of $a_1$ and
hence is a region of type $B_1$. This completes the proof since it is easily verified
that (5.20) is equivalent to the likelihood ratio test.

The proof for regions which are not necessarily symmetric in the variables
follows similarly if instead of the transformation (5.2) one uses a transformation
$U_i = f_i(X_1, \ldots, X_n)$ which is one to one and such that $U_1 = Z_1$, $U_2 = \sum_{i=2}^n Z_i$.
The distribution of $U_3, \ldots, U_n$ is then independent of $a$ and $b$ since $U_1$, $U_2$
are a pair of sufficient statistics for these parameters, and the proof carries over
step by step.

Next we consider the hypothesis $H_2$: $b = 0$, and again we restrict ourselves to
regions which are symmetric in the variables, although as before the proof can
be modified to cover also nonsymmetric regions.

We first make the transformation to $Z_1, \ldots, Z_n$ given by (5.2). In the
$n - 1$ dimensional space of $Z_2, \ldots, Z_n$ we then transform to new variables
$U, \Psi_1, \ldots, \Psi_{n-2}$ where $U = \sum_{i=2}^n Z_i$ and where the $\Psi$'s are the generalized polar
angles. Obviously the distribution of the $\Psi$'s does not depend on $a$, since they
are homogeneous of degree 0 in the $Z$'s. Furthermore the $\Psi$'s are independently
distributed of $U$ since the probability density of the $Z$'s is constant over the
hyperplanes $U = u$. Thus

(5.21) $p(z_1, u, \psi_1, \ldots, \psi_{n-2}) = \dfrac{1}{(n - 1)!\, a^n} \exp\left[ -\dfrac{z_1 + u - nb}{a} \right] u^{n-2}\, p(\psi_1, \ldots, \psi_{n-2}).$

We next introduce new variables

(5.22) $V = Z_1 + U$ and $T = \dfrac{Z_1}{Z_1 + U}$

and find

(5.23) $p(v, t, \psi_1, \ldots, \psi_{n-2}) = \dfrac{v^{n-1}}{(n - 1)!\, a^n} \exp\left[ -\dfrac{v - nb}{a} \right] (1 - t)^{n-2}\, p(\psi_1, \ldots, \psi_{n-2})$

for $v \ge nb$, $\dfrac{nb}{v} \le t \le 1$, the numerical constants being absorbed in $p(\psi_1, \ldots, \psi_{n-2})$ so that
$(1 - t)^{n-2} p(\psi_1, \ldots, \psi_{n-2})$ integrates to 1 over $0 \le t \le 1$.

For $w$ under $H_2$ to be similar with respect to $a$, we must have

(5.24) $\displaystyle\int_0^{\infty} \frac{v^{n-1}}{(n - 1)!\, a^n} \exp\left[ -\frac{v}{a} \right] \int_{w_0(v)} (1 - t)^{n-2}\, p(\psi_1, \ldots, \psi_{n-2})\, dt\, d\psi_1 \cdots d\psi_{n-2}\, dv = \epsilon$

where $w(v)$ designates the intersection of $w$ with the hyperplane $V = v$, and
where $w_0(v)$ denotes the part of $w(v)$ lying between the hyperplanes $t = 0$
and $t = 1$.

Hence the condition of similarity may be written as

(5.25) $\displaystyle\int_0^{\infty} v^{n-1} e^{-v/a} f(v)\, dv = 0$ for all $a > 0$

where

(5.26) $f(v) = \displaystyle\int_{w_0(v)} (1 - t)^{n-2}\, p(\psi_1, \ldots, \psi_{n-2})\, dt\, d\psi_1 \cdots d\psi_{n-2} - \epsilon.$

By the uniqueness theorem for Laplace transforms, (5.25) implies $f(v) = 0$
for all $v > 0$, so that the condition of similarity finally reduces to

(5.27) $\displaystyle\int_{w_0(v)} (1 - t)^{n-2}\, p(\psi_1, \ldots, \psi_{n-2})\, dt\, d\psi_1 \cdots d\psi_{n-2} = \epsilon.$

Of all similar regions, let us find the one which has maximum power. Obviously
we want to include in $w(v)$ all points for which $t < 0$. In addition we want
to choose $w_0(v)$ such that

(5.28) $\displaystyle\int_{w_b(v)} (1 - t)^{n-2}\, p(\psi_1, \ldots, \psi_{n-2})\, dt\, d\psi_1 \cdots d\psi_{n-2} = \max$

where $w_b(v)$ is that part of $w(v)$ in which $\max\left( 0, \dfrac{nb}{v} \right) < t$.

If, for some alternative $b$, $w_0(v)$ is contained in $\dfrac{nb}{v} \le t \le 1$, then $w_b(v)$ and
$w_0(v)$ coincide and hence (5.28) attains its maximum value $\epsilon$ whatever the position
of $w_0(v)$ in $\dfrac{nb}{v} \le t \le 1$. If on the other hand $\dfrac{nb}{v}$ is so close to 1 that
$\dfrac{nb}{v} \le t \le 1$ is too small to contain $w_0(v)$, then (5.28) attains its maximum for
any $w_0(v)$ containing $\dfrac{nb}{v} \le t \le 1$. There exists therefore a unique $w_0(v)$ which
maximizes (5.28) for all values of $b$ and $v$, namely the region defined by

(5.29) $C(v) < t \le 1$

where $C$ is determined by (5.27).

Since under $H_2$ the statistics $V$ and $T$ are independent, $C$ does not depend
on $v$. The test

(5.30) $t < 0, \; > C$

which we have just shown to be uniformly most powerful, is also the likelihood
ratio test, which completes the proof of the theorem.

We shall finally consider an example of an optimum test in connection with a
rectangular distribution. Let $X_1, \ldots, X_n$ be independently and uniformly
distributed over $(a, a + \theta)$, where $\theta$ is positive. For testing the hypothesis
$H$: $a = a_0$, the test

(5.31) $\dfrac{y_1 - a_0}{y_n - y_1} < 0, \; > C$

where $Y_1$ and $Y_n$ are the smallest and the largest of the $X$'s respectively, is the uniformly
most powerful of all similar regions.

The proof of this goes through very much like that for $H_2$ in Theorem 5.
Without loss of generality we take $a_0 = 0$. Also again, to simplify the proof,
we restrict ourselves to regions which are symmetric in the variables. We need
the following lemma.

Lemma. Let $X_1, \ldots, X_n$ be independently and uniformly distributed over
$(a, a + \theta)$. Let $Y_i$ denote the $i$th $X$ in order of magnitude, and let

(5.32) $T_n = Y_n, \qquad T_k = \dfrac{Y_k}{Y_{k+1}} \quad (k = 1, \ldots, n - 1).$

Then for $a > 0$

(5.33) $p(t_1, \ldots, t_n) = \dfrac{n!}{\theta^n}\, t_n^{n-1} t_{n-1}^{n-2} \cdots t_2$

when

$a \le t_n \le a + \theta, \qquad \dfrac{a}{t_n t_{n-1} \cdots t_{k+1}} \le t_k \le 1, \quad (k = 1, \ldots, n - 1).$

This is easily seen by applying the usual method of Jacobians. The inequalities
describing the sample space of the $T$'s are equivalent to the following more
convenient ones:

(5.34) $a \le t_n \le a + \theta, \qquad \dfrac{a}{t_n} \le t_1 t_2 \cdots t_{n-1} \le 1, \qquad t_k \le 1, \quad (k = 1, \ldots, n - 1).$

Let us denote by $w(t_n)$ the intersection of a region $w$ with the hyperplane
$T_n = t_n$, and by $w_0(t_n)$ that part of $w(t_n)$ contained in the cylinder $0 \le t_k \le 1$,
$(k = 1, \ldots, n - 1)$; then we find as a necessary and sufficient condition for
$w$ to be similar with respect to $\theta$ (assuming $H$)

(5.35) $(n - 1)! \displaystyle\int_{w_0(t_n)} t_{n-1}^{n-2} t_{n-2}^{n-3} \cdots t_2\, dt_{n-1} \cdots dt_1 = \epsilon.$

Of all regions satisfying (5.35) we want to find the most powerful one. Let
us first consider alternatives $a > 0$. If $w_a(t_n)$ denotes the common part of $w_0(t_n)$
and the region

(5.36) $\dfrac{a}{t_n} \le t_{n-1} t_{n-2} \cdots t_1 \le 1,$

we must choose $w_0(t_n)$ such that

(5.37) $\displaystyle\int_{w_a(t_n)} t_{n-1}^{n-2} \cdots t_2\, dt_{n-1} \cdots dt_1 = \max.$

From this it follows easily that against alternatives $a > 0$ the uniformly best
choice for $w_0(t_n)$ is

(5.38) $t_1 t_2 \cdots t_{n-1} = \dfrac{y_1}{y_n} > C'(t_n),$

and since under $H$, $\dfrac{Y_1}{Y_n}$ is independently distributed of $T_n$, $C'(t_n)$ does not depend
on $t_n$.

Consider next alternatives $a < 0$. We include in the region of rejection all
points for which $Y_1 < 0$. To determine $w_0(t_n)$ we notice that, given $Y_1 > 0$,
the $X$'s are uniformly distributed between 0 and $a + \theta$. (Provided $a + \theta > 0$;
the case $a + \theta \le 0$ is trivial.) Hence the probability distribution of the $T$'s
given $Y_1 > 0$ is

(5.39) $p(t_1, \ldots, t_n \mid Y_1 > 0) = \dfrac{n!}{(a + \theta)^n}\, t_n^{n-1} t_{n-1}^{n-2} \cdots t_2$

when

$0 \le t_n \le a + \theta, \qquad 0 \le t_k \le 1$ for $k = 1, \ldots, n - 1$.

Thus

(5.40) $\dfrac{p(t_1, \ldots, t_{n-1} \mid t_n, \; a < 0, \; Y_1 > 0)}{p(t_1, \ldots, t_{n-1} \mid t_n, \; a = 0)}$

is independent of $t_1, \ldots, t_{n-1}$ and hence the power of $w$ against alternatives
$a < 0$ is independent of the choice of $w_0(t_n)$. Therefore the region

(5.41) $y_1 < 0, \qquad \dfrac{y_1}{y_n} > C'$

is uniformly most powerful against all alternatives. But (5.41) is equivalent to

(5.42) $\dfrac{y_1}{y_n - y_1} < 0, \; > C.$
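The constants $C'$ and $C$ of (5.41) and (5.42) can be written down explicitly. The following sketch rests on a short calculation that is not in the text: under $H$ (with $a_0 = 0$), given $Y_n$ the remaining $n - 1$ observations are uniform on $(0, Y_n)$, so $P\{Y_1/Y_n > c'\} = (1 - c')^{n-1}$, which gives $C' = 1 - \epsilon^{1/(n-1)}$ and $C = C'/(1 - C')$. The function names are hypothetical, and the sample is assumed to contain at least two distinct values.

```python
def rectangular_test_cutoff(n, epsilon):
    """Cutoff C of (5.42), from P(Y_1 / Y_n > C') = (1 - C')^(n - 1) = epsilon under H."""
    c_prime = 1.0 - epsilon ** (1.0 / (n - 1))
    # y_1 / y_n > C' is the same event as y_1 / (y_n - y_1) > C' / (1 - C').
    return c_prime / (1.0 - c_prime)


def rejects(x, cutoff):
    """Apply the test (5.42) with a_0 = 0: reject if y_1 / (y_n - y_1) < 0 or > cutoff."""
    y1, yn = min(x), max(x)
    stat = y1 / (yn - y1)  # assumes y_n > y_1
    return stat < 0.0 or stat > cutoff


c3 = rectangular_test_cutoff(3, 0.05)
print(c3, rejects([-0.1, 0.5, 0.9], c3), rejects([0.4, 0.5, 0.9], c3))
```

A negative smallest observation always leads to rejection, since it is impossible under $H$; otherwise rejection occurs only when the sample sits far from 0 relative to its range.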

It is interesting to compare this result with that for the corresponding simple
hypothesis. Let H' be the hypothesis: a = 0 when the X's are assumed independently
and uniformly distributed over (a, a + 1). There exists no uniformly
most powerful test of H'; instead the two uniformly most powerful one-sided
tests exist. By analogy with the normal case one might then expect for H'
that of all tests with symmetric power functions, there be a uniformly most
powerful one. This however is not so: there exist infinitely many admissible
tests with symmetric power function.


In this and the previous section we restricted ourselves to problems involving
only one nuisance parameter. However, the method applies also to problems
involving several nuisance parameters.

In the usual way (cf. [20, 9]) the results of this section may be translated to
give optimum sets of confidence intervals for estimating the parameters in
question. In this connection it is an open question whether the confidence regions
based on the type $B_1$ tests discussed in section 2 will always be intervals; one
would expect this to be the case.

The author wishes to acknowledge his indebtedness to Professor P. L. Hsu
for many helpful suggestions.

Added in proof. In a joint paper by Professor Henry Scheffé and the present
author, which has been submitted to the Proceedings of the National Academy of
Sciences, a result is given concerning the existence of certain 1:1 transformations.
This result bears on Section 4 of the present paper, where a question arises
concerning the existence of a 1:1 transformation. The existence of such a
transformation is now assured and, as a consequence, the last paragraph of Section 4
has become superfluous.
A CORNER TEST FOR ASSOCIATION

By Paul S. Olmstead and John W. Tukey
Bell Telephone Laboratories and Princeton University

1. Summary. This paper proposes a new test (the “quadrant sum”) for
the association of two continuous variables. Its notable properties are:

(1) Special weight is given to extreme values of the variables. 

(2) Computation is very easy. 

(3) The test is non-parametric. 

Significance levels (for the quadrant sum) are given to the accuracy needed for
practical use. To this accuracy they are independent of sample size (see Fig. 1).
The generating function of the quadrant sum is given for the null hypothesis
(no association = independence). A limiting distribution is deduced and
compared with the cases 2n = 4, 6, 8, 10, and 14. Extension to higher dimensions,
and application to serial correlation are discussed.

2. Description of test (even number in sample). We shall describe the
test as though a scatter diagram had already been drawn. The possibilities of
direct computation from tabular data are indicated by the examples in sections
8 and 9.

In the scatter diagram, draw the two lines $x = x_m$, $y = y_m$, where $x_m$ is the
median of the x-values without regard to the values of y, and $y_m$ is the median
of the y-values without regard to the values of x. Think of the four quadrants
or corners thus formed as being labelled +, −, +, − in order, so that the upper
right and lower left quadrants are positive. Beginning at the right hand side
of the diagram, count in (in order of abscissae) along the observations until
forced to cross the horizontal median. Write down the number of observations
met before this crossing, attaching the sign + if they lay in the + quadrant,
and the sign − if they lay in the − quadrant. Repeat this process moving
up from below, moving to the right from the left, and moving down from above.
The quadrant sum is the algebraic sum of the four terms thus written down.
This process is illustrated in Fig. 2, where the black dots represent contributions
to the sum, and the dotted lines, crossings.
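
The counting procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' own computing scheme: it assumes an even number of pairs with no ties, and the function name `quadrant_sum` and the dictionary keys are hypothetical labels chosen here.

```python
from statistics import median


def quadrant_sum(xs, ys):
    """Quadrant sum for an even number of untied (x, y) pairs.

    Returns (sum, terms), where terms holds the four signed counts.
    """
    if len(xs) != len(ys) or len(xs) % 2:
        raise ValueError("need an even number of (x, y) pairs")
    xm, ym = median(xs), median(ys)  # fall between points when there are no ties
    pts = list(zip(xs, ys))

    def above(p):  # side of the horizontal median
        return p[1] > ym

    def right(p):  # side of the vertical median
        return p[0] > xm

    def term(axis, descending, side_of):
        # Come in from one edge of the diagram and count points until the
        # other coordinate changes sides of its median.
        ordered = sorted(pts, key=lambda p: p[axis], reverse=descending)
        first = ordered[0]
        run = 0
        for p in ordered:
            if side_of(p) != side_of(first):
                break
            run += 1
        # Upper right and lower left quadrants are positive.
        sign = 1 if right(first) == above(first) else -1
        return sign * run

    terms = {
        "right": term(0, True, above),
        "bottom": term(1, False, right),
        "left": term(0, False, above),
        "top": term(1, True, right),
    }
    return sum(terms.values()), terms
```

For the six pairs x = 1, ..., 6 with y = 2, 1, 4, 3, 6, 5 each of the four terms is +2 and the sum is 8; reversing the y-values to 6, 5, ..., 1 gives the extreme value −4n = −12.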

When there are an even number of pairs (x, y) and no ties, the medians will 
pass between the points. In this, the simplest case, the distribution of the 
quadrant sum is known for the hypothesis of no association (that is, of inde- 
pendence), and significance levels are given in Table 1 for the magnitude (abso- 
lute value) of the sum. It will be noticed that the sample size does not enter in 
any important way. 

The cases of an odd number of observations and of ties are discussed in the
next two sections. Simple devices make the test usable in most cases. A very
great tendency toward ties, however, will make it inapplicable. This will be
unimportant in most applications because of the fact that attention is being
directed to the periphery.

Fig. 2. Scatter diagram of 116 pairs of observations. Quadrant sum |S| = 16 1/2,
P ≈ 0.5%. Individual terms: top = +3, right = +1, bottom = +6, left = +6 1/2.

The set of data which prompted the development of the test is shown in Fig. 2.
The accompanying report described it as follows: "The various points appear
to be scattered almost completely at random and give little indication of
correlation." The quadrant sum is 16 1/2, which is significant at the 0.5% point.
Intuitively, the significant association of the peripheral points is clear.
3. Description of test (odd number in sample). If the sample size is odd,
then we may usually follow the process outlined above. We will have difficulty
only when the counting process meets a point, one of whose coordinates is a
median. In this case we employ a simple device, namely:

Given a sample of 2n + 1 pairs, let $x^*$ and $y^*$ be the medians of the x-values
and of the y-values, respectively. Let the pairs in which they occur be $(x^*, y_i)$
and $(x_j, y^*)$, respectively. Replace these two pairs by the single pair $(x_j, y_i)$.
There are now 2n pairs and the regular method can be applied.

The quadrant sum so obtained from an unassociated population has the same
distribution as that formed directly from 2n pairs.
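
A minimal sketch of this device in Python follows. The function name `merge_median_pairs` is a hypothetical label, the input is assumed to be free of ties, and the handling of the case where a single pair carries both medians (it is simply dropped) is an assumption made here, since the text does not discuss it.

```python
from statistics import median


def merge_median_pairs(pairs):
    """Reduce 2n + 1 untied (x, y) pairs to 2n pairs.

    The pair holding the x-median, (x*, y_i), and the pair holding the
    y-median, (x_j, y*), are replaced by the single pair (x_j, y_i).
    """
    if len(pairs) % 2 == 0:
        raise ValueError("expected an odd number of pairs")
    x_med = median(x for x, _ in pairs)
    y_med = median(y for _, y in pairs)
    i = next(idx for idx, (x, _) in enumerate(pairs) if x == x_med)
    j = next(idx for idx, (_, y) in enumerate(pairs) if y == y_med)
    rest = [p for idx, p in enumerate(pairs) if idx not in (i, j)]
    if i == j:
        # Assumption (not covered in the text): one pair carries both
        # medians, so dropping it already leaves 2n pairs.
        return rest
    return rest + [(pairs[j][0], pairs[i][1])]
```

For the five pairs (1, 3), (2, 5), (3, 1), (4, 2), (5, 4) the medians are x* = 3 and y* = 3, so the pairs (3, 1) and (1, 3) are replaced by (1, 1).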

TABLE 1

Working significance levels for magnitudes of quadrant sums

Significance level (conservative)    Magnitude of quadrant sum*
10%                                  9
5%                                   11
2%                                   13
1%                                   14-15
0.5%                                 15-17
0.2%                                 17-19
0.1%                                 18-21

* The smaller magnitude applies for large sample size, the larger magnitude
for small sample size. Magnitudes equal to or greater than twice the sample
size less six should not be used.

4. Description of test (treatment of ties). The behavior of the test is known
when (1) there is no association, (2) the probability of a tie in x-values or y-values
is zero. The following approximation, which has an unknown effect on the
distribution, is suggested when ties are present:

When a tied group is reached, count the number in the tied group favorable
to continuing and the number unfavorable. Treat the tied group as if the
number of its points preceding the crossing of the median were

$$\frac{\text{number favorable}}{1 + \text{number unfavorable}}.$$

It seems likely that this approximation is conservative.
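
One way to read this rule, offered here as an interpretation rather than something the text states, is that it equals the average number of favorable points met before the first unfavorable one when the tied points are taken in a random order. The sketch below checks that reading by brute-force enumeration; both function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def tied_group_count(favorable, unfavorable):
    """Fractional count suggested in the text for a tied group."""
    return Fraction(favorable, 1 + unfavorable)


def mean_run_before_crossing(favorable, unfavorable):
    """Average, over all orderings of the tied points, of the number of
    favorable points that come before the first unfavorable one."""
    labels = "F" * favorable + "U" * unfavorable
    orders = list(permutations(labels))
    total = 0
    for order in orders:
        run = 0
        for label in order:
            if label == "U":
                break
            run += 1
        total += run
    return Fraction(total, len(orders))
```

For example, one favorable and one unfavorable tied point contribute 1/2, which is how half-integer terms such as those quoted for Fig. 2 can arise.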

5. Discussion. When a moderate number, say 25 to 200, of paired observations
on two quantities are plotted as a scatter diagram, visual examination
frequently detects what seems to be definite evidence of association between
the variables. Often in such cases, the usual methods for measuring association
do not find statistical significance of association. Visual judgment,
particularly by engineers or scientists who may wish to take action on the basis
of their findings, gives greater weight to observations near the periphery of the
scatter diagram. This is not always desirable, but often it is very desirable.

A quantitative test of association with such concentration on the periphery has
been lacking. The quadrant sum test was developed to fill the gap. Its features
of speed and non-parametricity are useful but secondary from this point
of view.

When uniform attention to the whole scatter diagram is desired, the quadrant
sum test is of unknown usefulness. We know little enough of the operating
characteristics of the more conventional tests, such as:

1. The product moment correlation coefficient 

2. The four-fold table formed by the medians 

3. The biserial correlation coefficient 

4. The rank correlation coefficient 

and less about the operating characteristics of the present test. In this case,
the quadrant sum test can only be recommended definitely for exploratory
investigations of large amounts of data.

There are many situations, however, where we do not know where to concen- 
trate our attention, and where speed and non-parametricity are cardinal virtues 
in a test. One example is the use of serial correlation in studying industrial 
processes. We may guess that here we are interested in the periphery, but 
neither theory nor experience can, so far, prove this. In such situations the 
quadrant sum is by far the fastest to use of any of the tests known to the authors, 
and we believe one of the most useful. 

6. Elementary derivations. We can easily find the distribution of

1. An individual term of the quadrant sum
a. For fixed sample size
b. In the limit

2. The quadrant sum itself
a. For fixed sample size
b. In the limit, assuming asymptotic independence of the four terms.

This we shall do now, leaving the proof that 2a actually converges to 2b to a
later section.

Consider a sample of 2n pairs $(x_1, y_1), \dots, (x_{2n}, y_{2n})$ from a population in
which x and y are independent. It is both clear and easily verifiable that

1. The set of 2n x-values, $x_1, \dots, x_{2n}$

2. The set of 2n y-values, $y_1, \dots, y_{2n}$

3. The permutation of the order of the y-values when the pairs are ordered
by the x-values

which together determine the sample, are independently distributed, and that
any permutation is as likely as every other. (We have assumed no ties, which
is a consequence, with probability one, of the continuous cumulative
distributions of x and y.) Since the quadrant sum depends only on the permutation,
its distribution in the absence of association does not depend on the
distributions of x and y.

We must solve, then, certain purely combinatorial problems, under the
hypothesis that the (2n)! permutations of the y-values are all equally likely.
It may simplify matters to assume that the values of x in the sample are 1, 2, ...,
2n and that those of y are the same. How, then, do we calculate the distribution
of a single term of the quadrant sum? Let us begin with small x-values,
and the pair $(1, y_1)$. If $y_1 = 1, 2, \dots, n$, we count “one” positive, and if
$y_1 = n + 1, n + 2, \dots, 2n$, we count “one” negative. We pass on to $(2, y_2)$
and so on. How many permutations yield a count of exactly k positive values?
Those in which $y_1, y_2, \dots, y_k$ are equal to or less than n, $y_{k+1}$ equal to or greater
than n + 1, and the other (2n − k − 1) y's are arbitrary. There are:

$$n(n-1)\cdots(n-k+1)\cdot n\cdot(2n-k-1)!$$

such permutations, the fraction of all (2n)! permutations being:

$$(1)\qquad p_k = \frac{n(n-1)\cdots(n-k+1)\,n}{(2n)(2n-1)\cdots(2n-k+1)(2n-k)}$$

which is, then, the probability that this contribution will equal +k, or by
symmetry, the probability that it will equal −k, $k \ne 0$.

For large n, this becomes merely:

$$(2)\qquad p_k = 2^{-(k+1)}, \qquad k \ne 0.$$
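
As a numerical check on the single-term probability, the sketch below evaluates the exact fraction, n(n − 1)···(n − k + 1)·n divided by (2n)(2n − 1)···(2n − k), with rational arithmetic and compares it with the large-n value 2^{−(k+1)}. The function names are hypothetical, and the range 1 ≤ k ≤ n is assumed.

```python
from fractions import Fraction


def single_term_prob(n, k):
    """Exact Pr(term = +k) for a sample of 2n untied pairs, 1 <= k <= n."""
    p = Fraction(1)
    for i in range(k):
        p *= Fraction(n - i, 2 * n - i)  # first k y-values fall at or below n
    return p * Fraction(n, 2 * n - k)    # the next y-value falls above n


def limiting_prob(k):
    """Large-n value of the same probability."""
    return Fraction(1, 2 ** (k + 1))


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, float(single_term_prob(50, k)), float(limiting_prob(k)))
```

Since a term must take one of the values ±1, ..., ±n, twice the sum of these probabilities over k = 1, ..., n equals one, which gives a quick consistency check for any n.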

In order to obtain the distribution of the quadrant sum itself, we must concern
ourselves with the lack of independence of the four terms. This is indicated
most clearly in the case of 2n = 2, where the 2! = 2 permutations yield
+1 +1 +1 +1 = 4 and −1 −1 −1 −1 = −4. Here, there is complete lack
of independence. We shall see later that there is effectively independence in
the limit, so that it is worth while to calculate the sum of four independent
terms with the limiting distribution (2) and find that it satisfies:

$$(3)\qquad \Pr(|\text{independent sum of 4 terms}| \ge k) = \frac{9k^3 + 9k^2 + 168k + 208}{216 \cdot 2^k}, \qquad k > 0.$$

The details will be omitted.
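
The omitted details can at least be checked numerically. The sketch below assumes that each of the four independent terms takes the values ±i with probability 2^{−(i+1)}, and that the closed form for the tail is (9k³ + 9k² + 168k + 208)/(216·2^k), the numerator being the one quoted in the footnote to Table 2. The function names and the truncation point `cutoff` are choices made here.

```python
from fractions import Fraction


def tail_closed_form(k):
    """Pr(|sum of four independent limiting terms| >= k) for k > 0."""
    return Fraction(9 * k**3 + 9 * k**2 + 168 * k + 208, 216 * 2**k)


def tail_by_convolution(k, cutoff=60):
    """The same tail, obtained by convolving four copies of the single-term
    distribution truncated at |term| <= cutoff (the lost mass is negligible)."""
    term = {s * i: 2.0 ** -(i + 1) for i in range(1, cutoff + 1) for s in (1, -1)}
    dist = {0: 1.0}
    for _ in range(4):
        nxt = {}
        for a, pa in dist.items():
            for b, pb in term.items():
                nxt[a + b] = nxt.get(a + b, 0.0) + pa * pb
        dist = nxt
    return sum(p for value, p in dist.items() if abs(value) >= k)


if __name__ == "__main__":
    for k in (1, 5, 10, 15):
        print(k, float(tail_closed_form(k)), tail_by_convolution(k))
```

Each term has variance 6, so the sum of four independent terms has variance 24.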

A simple device, reminiscent of Wald’s [3, 1943] establishment of the
two-dimensional tolerance limits, enables us to avoid difficulties with lack of
independence and compute the exact distribution of the quadrant sum for any n.
We decompose the permutation of the y-values into the following parts, which
together specify the permutation:

(a) The number, j, of pairs in the upper right quadrant.

(b) The set of j values of x between n + 1 and 2n corresponding to pairs in
the upper right quadrant.

(c) The set of j values of y between n + 1 and 2n corresponding to points
in the upper right quadrant.

(d) The set of j values of x between 1 and n corresponding to pairs in the
lower left quadrant. (Note that the use of medians ensures that the
lower left and upper right quadrants contain the same number of points.)

(e) The set of j values of y between 1 and n corresponding to pairs in the
lower left quadrant.

(f) The permutation of j objects defined by the pairs in the upper right
quadrant.

(g) The permutation of n − j objects defined by the pairs in the upper left
quadrant.

(h) and (i) The permutations from the remaining quadrants.

It is easily verified that: (1) given j, items (b) to (i) can be assigned at will, (2)
each assignment of (a) to (i) corresponds to one and only one permutation, (3) the
quadrant sum depends only on items (b) to (e). In fact, the right hand term
depends on item (b), the upper term on item (c), the left hand term on item (d)
and the lower term on item (e). While j remains fixed, the terms behave
independently.

For fixed j, what is the distribution of a single term? If a set of j x-values
gives the term +k, it must contain the k largest x-values and not contain the
next. There are:

$$\binom{n-k-1}{n-j-1}$$

such sets. The generating function for a single term is, then:

$$(4)\qquad \sum_{k}\left[\binom{n-k-1}{n-j-1}x^{k} + \binom{n-k-1}{j-1}x^{-k}\right].$$

Since the terms are independent for fixed j, and there are $(j!)^2((n-j)!)^2$
ways to supply the permutations forming items (f) to (i), the generating
function for the quadrant sum, $S_n$, is:

$$(5)\qquad G_n(x) = \sum_{j}\frac{(j!)^2((n-j)!)^2}{(2n)!}\left\{\sum_{k}\left[\binom{n-k-1}{n-j-1}x^{k} + \binom{n-k-1}{j-1}x^{-k}\right]\right\}^{4}.$$
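
The decomposition by j lends itself to direct computation. The sketch below is one possible implementation under stated assumptions, not the authors' procedure: for each j it counts the admissible sets giving each value of a single term, raises that count distribution to the fourth power, and weights by (j!)²((n − j)!)²/(2n)!. A brute-force enumeration over all (2n)! permutations is included for checking small cases. All function names are hypothetical, and the extreme cases j = 0 and j = n are handled explicitly rather than through binomial coefficients.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial


def term_counts(n, j):
    """Number of admissible sets giving each value of one term, for fixed j."""
    counts = {}
    for k in range(1, n + 1):
        if k <= j:  # the k extreme values are in the set, the next is not
            counts[k] = 1 if k == n else comb(n - k - 1, j - k)
        if k <= n - j:  # the k extreme values are absent, the next is present
            if k == n:
                counts[-k] = 1
            elif j >= 1:
                counts[-k] = comb(n - k - 1, j - 1)
    return counts


def exact_distribution(n):
    """Null distribution of the quadrant sum for 2n untied pairs."""
    dist = {}
    for j in range(n + 1):
        one = term_counts(n, j)
        four = {0: 1}
        for _ in range(4):  # four terms, independent once j is fixed
            nxt = {}
            for a, ca in four.items():
                for b, cb in one.items():
                    nxt[a + b] = nxt.get(a + b, 0) + ca * cb
            four = nxt
        weight = factorial(j) ** 2 * factorial(n - j) ** 2
        for s, c in four.items():
            if c:
                dist[s] = dist.get(s, 0) + Fraction(weight * c, factorial(2 * n))
    return dist


def brute_force_distribution(n):
    """Same distribution by enumerating every permutation (small n only)."""
    size = 2 * n
    dist = {}
    for perm in permutations(range(1, size + 1)):
        pts = list(zip(range(1, size + 1), perm))
        total = 0
        for axis, descending in ((0, True), (1, False), (0, False), (1, True)):
            ordered = sorted(pts, key=lambda p: p[axis], reverse=descending)
            other = 1 - axis
            first = ordered[0]
            run = 0
            for p in ordered:
                if (p[other] > n) != (first[other] > n):
                    break
                run += 1
            sign = 1 if (first[0] > n) == (first[1] > n) else -1
            total += sign * run
        dist[total] = dist.get(total, 0) + Fraction(1, factorial(size))
    return dist
```

For 2n = 2 this gives ±4 with probability 1/2 each, in agreement with the complete dependence noted earlier for that case.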


The exact probability of equalling or exceeding each value of Sn has been 
computed for 2n = 2, 4, 6, 8, 10, and 14. Table 2 gives these probabilities 
and Fig. 3 shows the values of 


- 4- logio Pr( I quadrant sum | > m) 

5 

this particular function being chosen for its relative constancy. The maximum
value of the quadrant sum is 4n, and for values of k less than 4n − 6, there
is quite good agreement between the curves for finite n and formula (3) at
the practically significant percentage points. The situation for very small
probabilities suggests a careful consideration of the limiting behavior of the
quadrant sum distribution (see section 10).

TABLE 2

Probability of a Sum of Absolute Value Equal to or Greater than k when a Sample
of 2n is Drawn from an Unassociated Population

  k     2n = 14    2n = ∞*
  0     1.0000     1.000000
  1     0.9115     0.912037
  2     0.7580     0.754630
  3     0.6039     0.599537
  4     0.4690     0.462963
  5     0.3547     0.346933
  6     0.2611     0.252025
  7     0.1876     0.177662
  8     0.1322     0.121817
  9     0.0918     0.081471
 10     0.0632     0.053295
 11     0.0432     0.034189
 12     0.0296     0.021557
 13     0.0202     0.013386
 14     0.0139     0.008200
 15     0.0096     0.004963
 16     0.0066     0.002972
 17     0.0045     0.001762
 18     0.0031     0.001036
 19     0.0021     0.000604
 20     0.0014     0.000350

Variance of k: 16 for 2n = 2; 24 for 2n = ∞.

* Probability for 2n = ∞, k > 0, is given by
$(9k^3 + 9k^2 + 168k + 208)/(216 \cdot 2^k)$.

Fig. 3. Comparative relationships for finite and infinite sample sizes and
normal approximation to the infinite sample size
The device for samples of 2n + 1 deserves a word of justification. If there
is no association, the 2n + 1 y-values are randomly paired with the 2n + 1
x-values, and, in particular, the y-value paired with the x-median is randomly
selected. If we pair it with the (randomly selected) x-value which was paired
with the y-median we still have random pairing. The pairing of the 2n pairs
is random, although neither the x-values nor the y-values make up a sample.
The randomness of pairing is all that has been used in the discussions of this
section.

7. Extension to higher dimensions. The same ideas that underlie the
quadrant sum test for two variables may be extended in several ways to give tests
for various types of association among three or more variables. Only one
three-variable case will be discussed here, leaving further extension to the
reader.

Given three variables, x, y, and z, and a sample of matched observations on 
these, it is clearly possible to use the simple quadrant sum test for two variables 
to investigate association between x and y separately, between y and z separately, 
and between z and x separately. If the Pearson coefficient of correlation were 
being computed and were found to be close to zero for each of these pairs, it 
would be assumed that there was no detectable association through the second
moments. In a trivariate normal or Gaussian distribution, where the first and
second moments determine the whole distribution, if there is independence
between the separate pairs of variables, there is no possibility of a three-way
association. It is of some interest, however, to notice that a corner sum test
can be devised that will measure the effect of such triple association in case it
does exist.

Consider the octants into which the three median planes for x, y, and z,
respectively, divide the three dimensional scatter diagram and label the octants
alternately plus and minus, in the manner suggested by Fig. 4. More precisely,
an octant is counted as plus if an odd number, that is three or one, of the
variables are greater than the medians of the sample, and the remaining octants are
labelled minus. It is clear that we may repeat the process of coming in along
each axis passing from observation to observation as long as they remain in a
region of fixed sign, and writing down as a contribution to the final or octant
sum the number of such consecutive elements and the sign of the region in which
they were found. There will be six terms rather than four, as was the case
for the test based on quadrants, and so a new set of significance levels will be
required. Table 3, following, lists the situation for a very large sample.

The situation has been sketched for the case of 2n triples. If there are 2n + 1
triples, then we may have trouble with the medians again. However, a similar
device works, except that we must agree on a last variable in order to form the
synthetic triples uniquely. For example, consider the triples (m, 3, 5), (9, m, 1),
(12, 4, m), where m denotes the median. Taking the order in which the
variables are written, we get (12, 3, 5) and (9, 4, 1) as the synthetic triples. Other
orders would yield (9, 3, 5) and (12, 4, 1) or (9, 3, 1) and (12, 4, 5). This slight
dissymmetry is not pleasing but should give no difficulty.

Fig. 4. Octant schematic: solid sections taken as positive

TABLE 3

Working significance levels for the magnitudes of the octant sum

Significance level    Magnitude of octant sum*
10%                   11
5%                    13
2%                    15
1%                    16
0.5%                  18
0.2%                  20
0.1%                  21

* Computed for large samples only and based on normal approximation, see
section 11 for discussion of this and higher dimensional cases.

8. Nongraphical example. The following example of 78 successive observations
of four variables shows how this test may be applied without plotting and
how simple the computation still remains. The data concern a metallurgical
problem in mass production and are taken from L. H. C. Tippett, Table XXII,
page 63 [2]. An excerpt from the data is given in Table 4 together with
Tippett’s calculated correlations. This table also shows the preliminary marking
of each individual measurement as above (+) for its variable, below (−), or
on the median (0). From this table we see, for example, that increasing T
contributes a term −3 to the quadrant sum for T and D. It is often desirable to
prepare auxiliary tables to assist in computing the components of the quadrant
and hyperquadrant sums. Such a table is Table 5 for low values of Fuel (F−)
arranged in consecutive ascending numerical order. The entries on this table
for the five columns headed F, T, M, A, and D are directly comparable to the
entries in Table 4. For example, F = 155 is − with respect to the fuel median
and T = 7, −; M = 1541, +; A = 1762, +; D = 152, +. The double, triple,
quadruple and quintuple headed columns contain simply the algebraic
multiplication of the signs in the appropriate T, M, A, or D columns. Thus, TM
for F = 155 is −, MAD is +, and TMAD is −. The contribution to each
quadrant or hyperquadrant sum is simply the count of the consecutive like
signs from the top of a column. For column AD, we have 7 consecutive +
signs and since the contribution is to FAD and F is −, the contribution in this
case to the octant sum is −7. The results from the ten tables of which Table 5
is a sample are then carried to the summary computation shown in Table 6.
The contribution from Table 5 is shown on line F−. The totals are computed
and their probabilities of occurrence determined.

TABLE 4

Excerpt from Tippett's Table

Time T*    Fuel F*    Material M*    Articles A*    Duration D*
 1 -       240 +      1457 -         1895 +         168.5 +
 2 -       196 -      2078 +         2121 +         152 +
 3 -       192 -      1278 -         1437 -         163 +
 4 -       202 +      1398 -         1497 -         145 -
 5 -       206 +      1944 +         1592 +         153 +
 6 -       218 +      1464 -         1506 -         147.5 -
 7 -       155 -      1541 +         1762 +         152 +
 8 -       201 +      1502 +         1818 +         144.5 -
 9 -       211 +      1950 +         1144 -         151.5 +
10 -       236 +      1768 +         1654 +         161.5 +
etc. to
78 +       185 -      1536 +         1442 -         152 +

Median     Median     Median         Median         Median
39.5       199        1474           1588           149.5

* Location of observation relative to column median; + = above; - = below.

Tippett’s correlations (based on lightly rounded data)

$r_{FM} = +0.243$
$r_{FA} = +0.266$
$r_{MA} = +0.681$
$r_{FM.A} = +0.088$
$r_{FA.M} = +0.141$

TABLE 5

Sample Table for One Component of Quadrant and Hyperquadrant Sums. Low
Values of Fuel (F−)

Fuel F   T   M   A   D   TM  TA  TD  MA  MD  AD  TMA TMD TAD MAD TMAD
 98 -    +   -   -   -   -   -   -   +   +   +   +   +   +   -   -
135 -    +   -   -   -   -   -   -   +   +   +   +   +   +   -   -
140 -    -   -   -   -   +   +   +   +   +   +   -   -   -   -   +
146 -    -   -   -   -   +   +   +   +   +   +   -   -   -   -   +
147 -    +   +   -   -   +   -   -   -   -   +   -   -   +   +   +
149 -    -   +   -   -   -   +   +   -   -   +   +   +   -   +   -
151 -    +   -   -   -   -   -   -   +   +   +   +   +   +   -   -
153 -    +   -   +   -   -   +   -   -   +   -   -   +   -   +   +
155 -    -   +   +   +   -   -   -   +   +   +   -   -   -   +   -

Contributions to Sums

FT   FM   FA   FD   FTM  FTA  FTD  FMA  FMD  FAD  FTMA FTMD FTAD FMAD FTMAD
-2   +4   +7   +8   +2   +2   +2   -4   -4   -7   -2   -2   -2   +4   +2

9. Serial example. The following example, a sample of 144 observations of
the thickness of inlay for relay springs cut consecutively from a single sheet of
material, allows us to compare the resolution of the present test with that of
the serial product-moment correlation. The data are from Shewhart [1, 1941,
Table 1] and the serial correlations from lag 1 to lag 22 are from recent
calculations by Miss Dorothy T. Angell. The procedure for calculating the serial
quadrant sums is similar to that for obtaining the sums for section 8. A table
is prepared to show the observed consecutive order of the numerical values and
each is identified as above (+), below (−), or on the median (0). This gives a
table similar to one of the elements, say Fuel, in Table 4. Four computation
tables similar to Table 5 are required, one for the equivalent of moving from
the right, one from below, one from the left, and one from the top of a lag
correlation scatter diagram. One table from each direction will take care of all
lags. In the first, the marginal entries are the observed values listed in
descending numerical order. Opposite these are recorded from the previous table the
signs associated with observations for each lag with respect to each entry.
The second table would record the signs relating to the lags from the observed
values arranged in ascending order. The third table would record the signs
relating to leads from the observed values arranged in ascending order and the
fourth, the signs relating to leads from the observed values arranged in
descending order. The sign of the contribution from each group is the algebraic product
of the sign of the run and the sign of the marginal entries. The length of run
is determined in the same way as in Table 5. Table 7 illustrates the procedure
of determining the contribution from lags associated with the observations
arranged in ascending order.

TABLE 6

Summary Computation Table for Quadrant and Hyperquadrant Sums
(rows: contributions from each component table, such as F+ and F−; totals for
the quadrant, octant, hexadecant, and dotriacontant sums; probability (%);
significance at 5%, 1%, and 0.2%)

TABLE 7 (footnote: * Contribution to Serial Quadrant Sum.)

Two serial quadrant sums may be computed: a circular serial quadrant sum
or a noncircular serial quadrant sum. Circular items arise from considering
that the beginning of the set of observations is a continuation of the end in the
same way that this assumption is made in computing circular serial correlation
coefficients. In Table 7, circular items are shown in parentheses and are omitted
in calculating noncircular sums. In the particular table shown, the count of
the run lengths was identical for both types of sum, but in other cases this may
not be the case. Since the serial quadrant sum is relatively insensitive to
sample size, the noncircular serial quadrant sum has for all practical purposes
the same distribution as the circular quadrant sum. The correspondence in
this case between the serial correlation coefficient for each lag up to 22 and
the respective values of the two types of serial quadrant sums is shown in Fig. 5.

Fig. 5. Comparative performance on a serial (autocorrelative) example
(horizontal axis: lag)

10. Convergence to the limiting distribution. We shall consider several
chance sums. One of these is $S$, which has the limiting distribution discussed
in section 6. Another is $S_k$, which is the sum of four independent terms, each
distributed according to the limiting distribution curtailed at $\pm k$. Its
generating function is

$$G_k(x) = \left[\sum_{i=1}^{k} 2^{-(i+1)}\left(x^{i} + x^{-i}\right)\right]^{4}.$$

The total probability assigned to $S_k = -4k, -(4k-1), \dots, 4k$ is less than unity,
so that there is nonzero probability that $S_k$ is not defined. The third is $S_n$,
the quadrant sum itself, whose generating function is (5), and the fourth is the
result of the same sort of curtailment applied to $S_n$. It will be denoted by
$S_{n,k}$ and its generating function is

$$G_{n,k}(x) = \sum_{j}\frac{(j!)^2((n-j)!)^2}{(2n)!}\left\{\sum_{i=1}^{k}\left[\binom{n-i-1}{n-j-1}x^{i} + \binom{n-i-1}{j-1}x^{-i}\right]\right\}^{4}.$$

This again corresponds to a total probability less than unity.
It is clear that

$$\Pr(S_{n,k} = m) \le \Pr(S_n = m)$$

and

$$\Pr(S_k = m) \le \Pr(S = m).$$

We shall soon show that

$$(6)\qquad \lim_{n\to\infty}\Pr(S_{n,k} = m) = \Pr(S_k = m)$$

and this will imply that

$$\lim_{n\to\infty}\Pr(S_n = m) = \Pr(S = m)$$

which is the desired result. The implication runs as follows: given $\epsilon$, we can
choose $k$ so large that

$$\Pr(S_k \text{ defined}) > 1 - \epsilon/3$$

whence

$$|\Pr(S_k = m) - \Pr(S = m)| < \epsilon/3$$

and then choose $n$ so large that

$$|\Pr(S_{n,k} = m) - \Pr(S_k = m)| < \epsilon/(24k + 6)$$

for $m = -4k, -4k + 1, \dots, 4k$, whence

$$\Pr(S_{n,k} \text{ defined}) > 1 - \epsilon/3 - \frac{8k + 1}{24k + 6}\,\epsilon = 1 - \frac{16k + 3}{24k + 6}\,\epsilon$$

and hence

$$|\Pr(S_{n,k} = m) - \Pr(S_n = m)| < \frac{16k + 3}{24k + 6}\,\epsilon,$$

this inequality holding automatically for $|m| > 4k$. Hence,

$$|\Pr(S_n = m) - \Pr(S = m)| \le |\Pr(S_n = m) - \Pr(S_{n,k} = m)| + |\Pr(S_{n,k} = m) - \Pr(S_k = m)| + |\Pr(S_k = m) - \Pr(S = m)| < \frac{16k + 3}{24k + 6}\,\epsilon + \frac{\epsilon}{24k + 6} + \frac{\epsilon}{3} = \epsilon.$$

This method is clearly of general application in such problems.

We turn now to the proof of (6). The expression for $G_{n,k}(x)$ shows that we
may consider it the result of the following process: the integer $j$ is a chance
quantity with the distribution

$$\Pr(j) = \binom{n}{j}^{2}\Big/\binom{2n}{n}.$$

For fixed $j$, $G_{n,k}$ is the average over $j$ of

$$G_{n,k,j}(x) = \left\{\sum_{i=1}^{k}\left[\frac{\binom{n-i-1}{n-j-1}}{\binom{n}{j}}x^{i} + \frac{\binom{n-i-1}{j-1}}{\binom{n}{j}}x^{-i}\right]\right\}^{4}.$$

The first of these relations shows that $j/n$ converges stochastically to $\tfrac12$ as $n$
approaches infinity. The second shows, since

$$\frac{\binom{n-i-1}{n-j-1}}{\binom{n}{j}} = \frac{(n-i-1)!\,(n-j)!\,j!}{(n-j-1)!\,(j-i)!\,n!} = \frac{(n-j)(j)(j-1)\cdots(j-i+1)}{n(n-1)(n-2)\cdots(n-i)}$$

$$\frac{\binom{n-i-1}{j-1}}{\binom{n}{j}} = \frac{(n-i-1)!\,(n-j)!\,j!}{(n-j-i)!\,(j-1)!\,n!} = \frac{(n-j)(n-j-1)\cdots(n-j-i+1)\,j}{n(n-1)\cdots(n-i)}$$

and both of these converge stochastically to $2^{-(i+1)}$ as $n$ approaches infinity,
that $G_{n,k,j}(x)$ converges stochastically to $G_k(x)$. Since these curtailed
generating functions involve only powers of $x$ in the finite range between $-4k$ and $+4k$,
the limiting relation (6) follows at once.

11. Effectiveness of normal approximation. Fig. 3 shows the relation
between the asymptotic distribution of the quadrant sum for large n and a normal
distribution with variance 24, i.e., the same variance as that of the asymptotic
distribution. The normal approximation is calculated from

$$\Pr(|S| \ge m) \approx \Pr\left(|t| > \frac{m - \tfrac12}{\sqrt{24}}\right)$$

where t is normally distributed with zero mean and unit variance. The
asymptotic and normal curves agree surprisingly well out to the 5% point, and an error
of a full unit in the significance level first occurs beyond the 0.5% point.

Since the asymptotic distributions for the quadrant, octant, hexadecant,
dotriacontant, ..., sums become more and more normal, the normal approximation
will be even better for higher dimensions. In r dimensions, this approximation
consists in treating

$$\frac{|S| - \tfrac12}{\sqrt{12r}}$$

as the absolute value of a standard deviate. This should be quite adequate for
large samples and r > 4.
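
The normal approximation is easy to tabulate. The sketch below assumes that it treats (|S| − 1/2)/√(12r) as a standard normal deviate, with the half-unit continuity correction; that reading reproduces the working levels of Table 3 for r = 3. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def normal_tail(m, r=2):
    """Approximate Pr(|corner sum| >= m) in r dimensions for large samples."""
    z = (m - 0.5) / sqrt(12 * r)
    return 2 * (1 - NormalDist().cdf(z))


def critical_magnitude(level, r=2):
    """Smallest integer magnitude whose approximate tail is at most `level`."""
    z = NormalDist().inv_cdf(1 - level / 2)
    return ceil(z * sqrt(12 * r) + 0.5)


if __name__ == "__main__":
    for level in (0.10, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001):
        print(level, critical_magnitude(level, r=3))
```

With r = 3 the variance is 36, and the seven levels from 10% down to 0.1% give the magnitudes 11, 13, 15, 16, 18, 20, and 21.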


12. Unsolved problems. The central unsolved problem in connection with 
the quadrant sum is: 

(1) What is the operating characteristic?

This has as a corollary the more general question: 

(2) How can the operating characteristic of a nonparametric test be de- 
scribed so as to be useful to the users of the test? 

There are, of course, minor problems which are much more easily soluble. A 
few, listed in order of practical importance, are: 

(3) What is the effect on the significance levels of the use of lagged values
of x as values of y?

(4) What are the exact distributions for moderate n in three or more
dimensions?

(5) Do the analogous limiting distributions hold for three or more
dimensions?

(6) What is a better approximation to the limiting distribution for moderate
n?

To encourage others to solve some of these, we close with the assurance that 
they have our good wishes. 
DISCRIMINANT FUNCTIONS

By George W. Brown
Iowa State College

1. Introduction: In the following sections the development of discriminant 
function techniques is approached from an elementary point of view, considering 
first an essentially trivial problem, then working up to the more complex situa- 
tions which may be handled by discriminant function methods. No attempt 
has been made to follow the pattern of the historical development in this process, 
and no consistent attempt has been made to allocate proper credit, in the text, 
to those individuals responsible for the introduction and exploitation of these 
methods. A more or less exhaustive bibliography of discriminant function 
applications and related theory is given at the end of this paper. 

Some historical perspective may be gained, however, from a very sketchy 
consideration of the early background of the subject. The first published 
application of the discriminant function seems to have been the work of Barnard 
(1935, [1]) on craniometry, following the suggestion of R. A. Fisher. Meanwhile 
P. C. Mahalanobis (1927, [30]; 1930, [31]) and, in this country, Hotelling (1931, 
[25]) had been concerned with a closely related problem, the construction of 
measures of the "distance" between two sets of multiple measurements, for which 
Karl Pearson's (1926, [34]) coefficient of racial likeness was not wholly adequate. 
Fisher (1936, [18]) gave a further example of the method and showed (1938, [19]) 
the relation between his work and that of Hotelling (1931, [25]; 1936, [27]). Thus 
the theory of discriminant function analysis proper is about ten years old, but is 
intimately related to researches which go back a few more years. 

A simple problem. Consider the very simple case of a single measurement, say 
$\xi$, which may be made in each of two populations, and let us suppose, for the 
sake of discussion, that $\xi$ is normally distributed, with unit variance, in each 
population, but with possibly different means in the two populations. 

Let 

$$E_1(\xi) = \alpha - \beta$$

$$E_2(\xi) = \alpha + \beta$$

be the mean values of $\xi$ over the two populations, with $\beta > 0$. As an example, 
we may consider the pH measurements of Iowa soil samples (Cox and Martin, 
[12]), for two soil populations, distinguished by the presence or absence of 
Azotobacter. From 100 samples containing Azotobacter and 186 samples containing 
no Azotobacter, we have the estimated averages of pH equal to 7.423 and 6.015 
respectively, with an estimated standard error of .625 within populations (see 
Fig. 1): 

$$\hat\alpha = 6.719, \qquad \hat\beta = .704, \qquad \hat\sigma = .625, \qquad \hat\beta/\hat\sigma = 1.13.$$

Let us suppose further that $\xi$ is the only measurement available on a single 
individual, not knowing to which of populations 1 and 2 the individual belongs. 

Fig. 1. Distribution of pH Measurements 

The problem is to classify this individual as a member of population 1 or 
population 2. It is clear that $\xi$ furnishes the only information on which to base a 
decision, and that essentially the only procedure available is to choose a number, 
say $\xi_0$, such that we choose population 1 when $\xi < \xi_0$ and population 2 when 
$\xi > \xi_0$. Furthermore, it is evident that the expected accuracy of classification 
depends on the size of $\beta$. If we wish to have equal risks of misclassification for 
members of the two populations we choose $\xi_0 = \alpha$. Then the probability of 
misclassification is given by $P\{\epsilon > \beta\}$, where $\epsilon$ is a normal deviate with unit 
variance. As one would expect, the probability of misclassification tends to 0 as 
$\beta \to \infty$ and tends to $\tfrac{1}{2}$ as $\beta \to 0$. In the Azotobacter example, if we assume 
that the estimates given are the population values, we choose $\xi_0 = 6.719$. The 
ratio $\beta/\sigma = 1.13$ is exceeded approximately 13% of the time in sampling from 
the normal distribution, leading to .13 as the probability of misclassification. 
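
The arithmetic of the soil example can be retraced in a few lines. The sketch below 
is only an illustration: it treats the quoted estimates as population values, as the 
text does, and the names `upper_tail`, `alpha`, `beta`, `ratio` and `p_misclass` are 
ours, not part of the original computation. 

```python
import math

def upper_tail(z):
    """P{eps > z} for a standard normal deviate eps."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Estimates quoted in the text, treated here as population values (an assumption).
mean_with, mean_without, sigma = 7.423, 6.015, 0.625

alpha = (mean_with + mean_without) / 2.0   # separation value xi_0 for equal risks
beta = (mean_with - mean_without) / 2.0    # half the distance between the two means
ratio = beta / sigma                       # beta expressed in standard-deviation units
p_misclass = upper_tail(ratio)             # probability of misclassification

print(round(alpha, 3), round(beta, 3), round(ratio, 2), round(p_misclass, 2))
```

The complementary error function is used so that no table of the normal integral 
is needed. 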

Consider now the slightly more general situation in which we consider a fixed 
variate, say $w$, with measurements $\xi$ distributed, for fixed $w$, with a mean of the 
form $\alpha + \beta w$. This is the standard regression situation. As before assume 
that $\xi$ is normally distributed about this mean with unit variance, that is 

$$\xi = \alpha + \beta w + \epsilon$$

where $\alpha$ and $\beta$ are constants, $w$ may take on any or all real values, and $\epsilon$ is a 
normal deviate. Note that if $w$ is restricted to take on only two values the 
structure reduces to the first structure considered. An example of the continuous 
type might be constructed by considering $w$ as genotypic yield of grain and $\xi$ 
a phenotypic measure of yield (Smith, [36]). 

The simple problem formulated for the two-population case may be reformulated 
here as follows: Given the relationship $\xi = \alpha + \beta w + \epsilon$, and given $\xi$ for 
an individual for which no other information is known, how shall we estimate $w$? 
For selective breeding the problem may be to select individuals for which $w$ is 
at one end of the scale, rather than to estimate $w$ itself. Whatever decision is 
to be made, it is still clear that $\xi$ furnishes the only available information, and 
that the certainty of the decision is a function of $\beta$. Since $(\xi - \alpha)/\beta = w + \epsilon/\beta$, 
the variance of this estimate of $w$ is $1/\beta^2$. Note that confidence intervals for $w$, 
given $\xi$, may be constructed from the normally distributed quantity $\xi - \alpha - \beta w$. 
It should be pointed out that in the usual regression case we are interested in 
predicting $\xi$ for given $w$, with the hypothesis as stated above, whereas in this 
case $\xi$ will be observed, and the problem is that of estimating, as a parameter of 
the distribution of $\xi$, the fixed variate $w$. 
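
As a sketch of this inverse use of the regression, assuming $\alpha$ and $\beta$ are known 
and the error has unit variance as in the model above, the estimate of $w$ and a 
confidence interval for it might be computed as follows. The function name 
`estimate_w` and the default multiplier 1.96 (a two-sided 95% interval) are 
illustrative choices, not taken from the text. 

```python
def estimate_w(xi, alpha, beta, z=1.96):
    """Estimate the fixed variate w from a single observed xi.

    Assumes xi = alpha + beta * w + eps, with eps a unit normal deviate and
    alpha, beta known. Returns the point estimate and a confidence interval
    obtained from the normal quantity xi - alpha - beta * w.
    """
    if beta == 0:
        raise ValueError("beta must not vanish if xi is to discriminate among w values")
    w_hat = (xi - alpha) / beta          # (xi - alpha) / beta = w + eps / beta
    half_width = z / abs(beta)           # standard deviation of the estimate is 1/|beta|
    return w_hat, (w_hat - half_width, w_hat + half_width)


# Hypothetical numbers chosen only to show the call.
w_hat, (lower, upper) = estimate_w(xi=7.0, alpha=5.0, beta=2.0)
print(w_hat, lower, upper)
```

The interval narrows in proportion to $1/|\beta|$, which is the sense in which the 
certainty of the decision is a function of $\beta$. 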

Obviously $\beta$ must not vanish if $\xi$ is to perform any discrimination among $w$ 
values. In practice, of course, $\alpha$ and $\beta$ will not be given as known values and the 
variance of $\epsilon$ will not be known, but a finite set of observations may be available, 
for which $w$ values are known and $\xi$ has been observed. The usual analysis of 
variance provides a significance test for the non-vanishing of $\beta$, which is 
equivalent to testing for the significance of the regression of $\xi$ on $w$. 

It is to be noted that this analysis reduces to the conventional between-within 
analysis (F or t-test) when we have the special case of two populations. Moreover, 
if we had treated $\xi$ as the fixed variate instead of $w$, and considered the 
regression of $w$ on $\xi$, the Analysis of Variance would have differed only in replacing 
$\sum(\xi - \bar\xi)^2$ throughout by $\sum(w - \bar w)^2$ and the relevant F-test would have been 
unchanged. 

When probabilities of misclassification are estimated from finite samples, as 
in the soil classification example, there are three sources of error: sampling error 
in the estimate of the separation value $\xi_0$, sampling error in the estimate of the 
distance between the population means, and sampling error in the estimated 
standard deviation of $\xi$ within populations. It does not appear difficult to set 
up confidence intervals for the probability of misclassification, assuming repeated 
classification of individuals given fixed initial samples. 
2. The one-dimensional discriminant function. We have been dealing so far 
with the simple situation in which only one measurement per individual is 
available for purposes of discrimination. Suppose we still have this measurement, 
call it $\xi_1$ now, but we have other measurements as well, say $\xi_2, \ldots, \xi_p$. 

As before $\xi_1 = \alpha_1 + \beta w + \epsilon_1$. For the moment suppose that the remaining 
measurements have mean values independent of $w$, so that 

$$\xi_m = \alpha_m + \epsilon_m, \qquad (m = 2, \ldots, p),$$

and let us assume also that the $\{\epsilon_m\}$ are mutually independent $(m = 1, 2, \ldots, p)$ 
and are normal deviates with unit variance. It is safe to assume that nobody 
would ever argue, in this case, that the measurements $\xi_2, \ldots, \xi_p$ provide 
information about the $w$ value for an individual. If, then, we were so fortunate 
that we were in this situation, and knew so, we could say that $\xi_1$ is our 
discriminant function, since, if any discriminating is to be done, $\xi_1$ has to do it. 


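TABLE 1 

Analysis of Variance for Regression 

              d.f.      Sums of Squares 
Regression    1         $r^2 \sum(\xi - \bar\xi)^2$ 
Error         N - 2     $(1 - r^2) \sum(\xi - \bar\xi)^2$ 
Total         N - 1     $\sum(\xi - \bar\xi)^2$ 

$$r = \frac{\sum(\xi - \bar\xi)(w - \bar w)}{\sqrt{\sum(\xi - \bar\xi)^2 \, \sum(w - \bar w)^2}}$$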


Suppose, now, that the measurements $\xi_1, \xi_2, \ldots, \xi_p$ are not explicitly 
available, but that we are able to observe a linearly equivalent set $x_1, x_2, \ldots, x_p$, 
related to the $\{\xi_m\}$ by the transformation 

$$x_m = \sum_{n=1}^{p} l_{mn}\,\xi_n,$$

where the $l_{mn}$ are unknown. For fixed $w$, $x_m$ has expected value 

$$\sum_{n=1}^{p} l_{mn}\alpha_n + l_{m1}\beta w = a_m + b_m w,$$

so that in general each observation provides information about $w$. Moreover, 
the $x_m$ are not in general mutually independent; it is evident that the 
population matrix of variances and covariances for fixed $w$ is given by 

$$\sigma_{mn} = \sum_{k} l_{mk}\, l_{nk}.$$


As an example of a set of correlated measurements, consider the Azotobacter 
example referred to above. In addition to pH values, determinations of available 
phosphate content and total nitrogen content were made on soil samples 
in each of the two populations. Means were as follows: 

                                            pH       Phosphate    Nitrogen 
Mean of 100 samples with Azotobacter        7.423    133.120      29.400 
Mean of 186 samples without Azotobacter     6.015     51.113      21.140 
Mean difference                             1.408     82.007       8.260 

Clearly the differences are proportional to the hypothetical $b_m$'s. The 
variance-covariance matrix, estimated from the 284 degrees of freedom within 
populations, is given by Table 2. 


TABLE 2 

$284(\sigma_{mn}) =$ 

              pH          Phosphate         Nitrogen 
pH            111.0879    2,292.7192        198.4026 
Phosphate                 1,042,799.1890    5,066.2645 
Nitrogen                                    29,422.3656 


Estimated correlation coefficients within populations are not large: .213 for pH 
and Phosphate, .110 for pH and Nitrogen, and .029 for Phosphate and Nitrogen. 

Another example is furnished by Fisher's Iris measurements [18], providing 
sepal length, sepal width, petal length, and petal width for each of 50 
individuals of Iris setosa and 50 individuals of Iris versicolor. This example is 
an unfortunate one in that either petal length or petal width alone is sufficient 
to discriminate the two populations as completely as anybody has a right to 
expect anytime. The petal lengths, for example, vary between 1.0 and 1.9 cm. 
for the 50 setosa, and between 3.0 and 5.1 cm. for the 50 versicolor. 

Let us proceed, under the assumption that available measurements, $x_m$, 
are distributed normally about mean values $a_m + b_m w$, with variance-covariance 
matrix $\sigma_{mn}$ for fixed $w$, keeping in mind the underlying model of 
$\xi_1, \xi_2, \ldots, \xi_p$, with 

$$x_m = \sum_n l_{mn}\xi_n, \qquad \xi_1 = \alpha_1 + \beta w + \epsilon_1, \quad \xi_2 = \alpha_2 + \epsilon_2, \quad \ldots, \quad \xi_p = \alpha_p + \epsilon_p.$$

The skeptic may wish to grant the first part of our assumptions without granting 
the hypothetical structure of $\xi$'s underlying the $x$'s. Hotelling's work [27] 
shows that such an underlying structure of $\xi$'s may always be provided, given 
the distribution of $x$'s for fixed $w$. In other words, a distribution of $x$'s for fixed 
$w$ leads essentially uniquely to an underlying $\xi$ model. 

The discriminant function, given $\sigma_{mn}$, $a_m$ and $b_m$, for $m, n = 1, 2, \ldots, p$, 
is 

$$X = \sum_{n} \Big( \sum_{m=1}^{p} \sigma^{mn} b_m \Big) x_n,$$

where $\sigma^{mn}$ is the reciprocal matrix to $\sigma_{mn}$. That is, the $\sigma^{mn}$ are the 
solutions of the linear systems [17] 

$$\sum_{k} \sigma^{mk}\sigma_{kn} = 0 \ \text{ if } m \neq n; \qquad m, n = 1, 2, \ldots, p,$$

$$\sum_{k} \sigma^{mk}\sigma_{km} = 1; \qquad m = 1, \ldots, p.$$

That $X$, as defined above, is properly called the discriminant function will 
become evident immediately. Putting $x_n = \sum_k l_{nk}\xi_k$ and $b_m = l_{m1}\beta$, we have 

$$X = \beta \sum_{m,n,k} \sigma^{mn}\, l_{m1}\, l_{nk}\, \xi_k.$$

Recalling that the $\sigma^{mn}$ are reciprocal to $\sigma_{mn} = \sum_k l_{mk} l_{nk}$, it can be seen that 
$\sum_{mn} \sigma^{mn} l_{m1} l_{nk} = 1$ if $k = 1$, and vanishes for $k \neq 1$. It follows that 

$$X = \beta\,\xi_1;$$

in other words, $X$ calculated as $\sum_{mn} \sigma^{mn} b_m x_n$ from known population quantities 
is proportional to the hypothetical $\xi_1$, the only one of the underlying 
measurements which is related to $w$, thus justifying the term discriminant function for 
$X$. It is clear that any other linear function of the $x$'s is also a linear function of 
the $\xi$'s, and can discriminate, at best, only as well as $X$ itself, since all the $\xi$'s 
are independent of $w$, with the exception of $\xi_1$. $X$ itself discriminates $w$ to the 
same extent that $\xi_1$, were it available, would discriminate. 

The degree of discrimination of $w$'s depends, as indicated in the previous 
section, on the ratio of the mean square of $\xi_1$ among $w$'s (mean square for 
regression), to the mean square of $\xi_1$ for fixed $w$ (mean square for error). Since $X$ 
is proportional to $\xi_1$, the same is true when $X$ is substituted for $\xi_1$. It turns 
out, of course, that $X$ is that linear combination of $x$'s for which the ratio of the 
mean square for regression to the mean square for error is a maximum, or, what 
is the same thing, $X$ is that linear combination of $x$'s which has maximum 
correlation with $w$. From any point of view $X$ appears to be the logical function 
of $x$'s to compute. It is clear that $\lambda X$ is precisely as good as $X$, if $\lambda$ is any 
constant. 
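
The identity $X = \beta\xi_1$ can be checked numerically. The sketch below is an 
illustration only: the mixing coefficients `L` (standing for the $l_{mn}$) and the value 
of `beta` are hypothetical, and the small elimination routine `solve` is ours. It forms 
$\sigma_{mn}$ and $b_m$, computes the coefficients of $X$ by solving the linear system rather 
than inverting the matrix, and expresses $X$ in terms of the $\xi$'s. 

```python
def solve(a, b):
    """Solve a * t = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Hypothetical mixing coefficients l_mn; any non-singular matrix would do.
L = [[2.0, 1.0, 0.0],
     [0.5, 3.0, 1.0],
     [1.0, -1.0, 2.0]]
beta = 0.7
p = len(L)

# sigma_mn = sum_k l_mk * l_nk and b_m = l_m1 * beta, as in the text.
sigma = [[sum(L[m][k] * L[n][k] for k in range(p)) for n in range(p)] for m in range(p)]
b = [L[m][0] * beta for m in range(p)]

# Coefficient of x_n in X is sum_m sigma^{mn} b_m, i.e. the solution of sigma * t = b.
t = solve(sigma, b)

# Substituting x_n = sum_k l_nk * xi_k gives the coefficient of each xi_k in X.
coef_on_xi = [sum(t[n] * L[n][k] for n in range(p)) for k in range(p)]
print(coef_on_xi)
```

Up to rounding, the coefficient on $\xi_1$ comes out as $\beta$ and the remaining 
coefficients vanish, whatever non-singular `L` is used. 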
In the two population case, where $w$ takes on only two values, $X$ is evidently 
proportional to $\sum \sigma^{mn}(\mu_{m1} - \mu_{m2})x_n$, where $\mu_{m1}$ and $\mu_{m2}$ are the mean values of 
$x_m$ in the two populations. $X$ is here the particular linear combination of $x$'s 
for which the ratio of the mean square between populations to the mean square 
within populations is a maximum. The value of this ratio, which measures the 
degree of discrimination possible, depends on the spread of the means of $X$ 
between the populations, or in general, on the spread of the means of $X$ over some 
given distribution of $w$'s. Given $\sigma_{mn}$ and $b_m$, the larger the spread of $w$ values 
the better overall discrimination will be obtainable. On the other hand, the 
coefficients for $X$ depend only on $\sigma_{mn}$ and $b_m$. 

Since $X$ is proportional to $\xi_1$, it follows that the discriminant function is 
invariant under non-singular linear transformation of the $x$'s, that is, if some set of 
$y$'s, linearly dependent on the $x$'s, had been observed, together with their means, 
variances and covariances, the discriminant values would not have changed. 
This invariance is obviously a desirable property, and as such was one of the 
goals of Fisher, Hotelling, and Mahalanobis. One more property of the 
discriminant function is of interest; $X$ is essentially equivalent to the maximum 
likelihood estimate of $w$. 

In our statistical model $w$ plays the role of a fixed variate or population 
parameter, and the $x$'s have a joint distribution about linear functions of $w$ as means. 
Suppose now that $\{\sigma_{mn}\}$ and $\{b_m\}$ are estimated from an analysis of variance 
and covariance on data for which $w$ as well as $x$ values are known. The problem 
of estimating $w$ for a single individual whose $x$ measurements are given resolves 
into a two-stage estimation process, the first stage being the estimation of 
$\{\sigma_{mn}\}$ and $\{b_m\}$ from the initial data, the second stage being the estimation of $w$ 
by the discriminant function whose coefficients are computed from the 
estimated $\{\sigma_{mn}\}$ and $\{b_m\}$. It has already been pointed out that $X$ is the linear 
combination of $x$'s which has greatest correlation with $w$. It turns out, then, 
that the coefficients of $X$ are proportional to those which would have been 
obtained from a formal regression analysis of $w$ on $x_1, x_2, \ldots, x_p$, considering the 
$x$'s as independent variables and $w$ as dependent variable, a direct interchange of 
roles as compared with the statistical model we have assumed. Of course two 
linear functions differing only by a factor of proportionality are equivalent in 
discrimination. If the formal analysis of variance is carried out for testing the 
significance of the regression of $w$ on $x_1, x_2, \ldots, x_p$, the relevant F ratio 
remains a valid test for the non-vanishing of the $b_m$ in spite of the inversion of 
dependent and independent variables. The analysis of variance is given in 
Table 3. 

$R$ is, of course, the conventional multiple correlation coefficient. An 
equivalent analysis can be carried out for $X$ itself, allowing sufficient degrees of freedom 
for the estimation of the constants in $X$, as given in Table 4. 

This analysis is proportional to the analysis given above. It might be noted 
that the mean square corresponding to error sum of squares in this analysis is 
$\sum_{mn}\sigma^{mn} b_m b_n$, which is $X$ evaluated for $x_n = b_n$, $(n = 1, 2, \ldots, p)$. 
TABLE 3 

Analysis of Variance for Regression 

              d.f.          Sums of Squares 
Regression    p             $R^2 \sum(w - \bar w)^2$ 
Error         N - p - 1     $(1 - R^2) \sum(w - \bar w)^2$ 
Total         N - 1         $\sum(w - \bar w)^2$ 


TABLE 4 

Analysis of Variance for X on w 

              d.f.          Sums of Squares 
Regression    p             $R^2 \sum(X - \bar X)^2$ 
Error         N - p - 1     $(1 - R^2) \sum(X - \bar X)^2$ 
Total         N - 1         $\sum(X - \bar X)^2$ 


TABLE 5 

Analysis of Variance of Discriminant Function 

                       d.f.    Sums of Squares    Mean Square 
Between populations    3       .030842            .01028 
Within populations     282     .021777            .00007722 
Total                  285 

In the Azotobacter example, Cox and Martin arrive at a discriminant function 
which has the analysis given in Table 5. 

It is evident that the difference between populations is highly significant. The 
choice of scale for $X$ in this case forces the sum of squares within populations to 
be equal to the difference between the mean $X$ values for the two populations. 
Thus the mean $X$ differs by .021777 for the two populations, and has an 
estimated standard error, within populations, equal to $\sqrt{.00007722} = .008788$. 
Half the difference, divided by the standard error, is the normal deviate 
corresponding to misclassification, if equal risks are taken. In this case the value 
of the normal deviate is 1.24, approximately, leading to an estimated probability 
of misclassification of about .11, which is not very much better than the .13 
which one would have obtained if pH alone had been used. 
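
The figures in this example can be approximately retraced from the mean 
differences and Table 2. The sketch below is not Cox and Martin's computation: it 
uses the 284 within-population degrees of freedom quoted with Table 2 (Table 5 
allows 282), it chooses its own scale for the coefficients, and the names `solve`, 
`coef`, `d2`, `deviate` and `p_misclass` are ours. It should give a normal deviate of 
roughly 1.24 and a misclassification probability near .11. 

```python
import math

def solve(a, b):
    """Solve a * t = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# 284 * sigma_mn from Table 2, written out symmetrically (pH, phosphate, nitrogen).
S = [[111.0879, 2292.7192, 198.4026],
     [2292.7192, 1042799.1890, 5066.2645],
     [198.4026, 5066.2645, 29422.3656]]
df_within = 284
sigma = [[value / df_within for value in row] for row in S]

# Mean differences between the two populations, proportional to the b_m.
d = [1.408, 82.007, 8.260]

# Discriminant coefficients (up to a constant factor): solution of sigma * coef = d.
coef = solve(sigma, d)

# With this scale the difference of mean X values equals the within variance of X.
d2 = sum(c * x for c, x in zip(coef, d))
deviate = math.sqrt(d2) / 2.0                      # half the difference over the s.d.
p_misclass = 0.5 * math.erfc(deviate / math.sqrt(2.0))
print(round(deviate, 2), round(p_misclass, 2))
```

Solving the linear system directly avoids forming the reciprocal matrix, which is 
all that is needed when only the coefficients of $X$ are wanted. 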

In this problem, as in conventional regression analysis, it is tempting, for 
various reasons, to consider the possibility of using smaller sets of classifying 
measurements. Moreover, a significance test for this situation is in general 
more interesting, as a practical matter, than the significance test for differences 
among populations, since the initial presumption is that we are interested in 
being able to discriminate, on the basis of $x_1, x_2, \ldots, x_p$. Suppose, for 
example, we wish to test whether the discriminant function $X_{(p)}$ based on 
$x_1, x_2, \ldots, x_p$ is significantly better than the discriminant function $X_{(r)}$ based on 
$x_1, \ldots, x_r$, with $r < p$. The relevant test is precisely the same as the test 
calculated formally from the regression of $w$ on the sets $x_1, \ldots, x_r$ and 
$x_1, x_2, \ldots, x_p$, with the analysis of variance given in Table 6. 

TABLE 6 

Analysis of Variance for Rejecting $x_{r+1}, \ldots, x_p$ 

                                                         d.f.          Sums of Squares 
Regression on $x_1, \ldots, x_r$                         r             $S_r$ 
Regression on $x_1, \ldots, x_r, x_{r+1}, \ldots, x_p$   p             $S_p$ 
Difference                                               p - r         $S_p - S_r$ 
Error                                                    N - p - 1     $S_T - S_p$ 
Total                                                    N - 1         $S_T$ 


TABLE 7 

Analysis of Variance for $X = X_0$ 

                                       d.f.          Sums of Squares 
Regression on $X_0$                    1             $S_0$ 
Regression on $x_1, \ldots, x_p$       p             $S_p$ 
Difference                             p - 1         $S_p - S_0$ 
Error                                  N - p - 1     $S_T - S_p$ 
Total                                  N - 1         $S_T$ 

Similarly, if we wish to test for the significance of a theoretical discriminant 
function, $X_0$, with preassigned coefficients, as compared with $X$, we have 
again the conventional test calculated from the formal analysis of the regression 
of $w$ on $x_1, x_2, \ldots, x_p$, as given in Table 7. 

As shown by Fisher [21] the relevant F-test for this hypothesis is computable 
as 

$$F = \frac{n - p + 1}{p - 1} \cdot \frac{R'^2}{1 - R'^2},$$

where $R'^2 = R^2(1 - r^2)$, $r$ is the correlation between $X$ and $X_0$ for fixed $w$, and 
$R$ is the multiple correlation for $w$ on $x_1, \ldots, x_p$, or, what is the same thing, the 
correlation of $w$ and $X$. 

The example of Smith [36] is an example in which the relationships of $x$'s 
to $w$ have to be estimated from analysis of variance and covariance of data in 
which the $w$'s are not really known, being related to genotypes. The regression 
of $x$'s on $w$ is estimated by a generalization of the components-of-variance 
method, from variance-covariance analyses in which the usual null hypotheses 
are significantly contradicted. The net effect is that the usual significance 
tests now fail to hold, although the algebraic calculations are formally equivalent 
to those given above, once the population relations of $x$'s to $w$ are established. 
When work of this kind is based on small samples, there is some difficulty in 
estimating the reliability of the results. 

3. Multi-dimensional discriminant functions. Instead of trying to discriminate 
between two populations or estimate a single parameter $w$, our problem may 
be to discriminate among several populations, not necessarily linearly related, 
or to estimate many independent parameters $w_1, w_2, \ldots, w_s$. Just as a single 
parameter $w$ is sufficient to distinguish between means of measurements for two 
different populations, $s$ parameters are sufficient to distinguish between means 
of $s + 1$ different populations, and exactly $s$ parameters will be required, if 
no linear relation obtains among the $s + 1$ populations. For example, with 
three populations, any measurement mean may be given the three possible 
values $\alpha$, $\alpha + \beta$, $\alpha + \gamma$, corresponding to $w_1 = w_2 = 0$ for population 1, $w_1 = 1$, 
$w_2 = 0$ for population 2, and $w_1 = 0$, $w_2 = 1$ for population 3. Geometrically 
we have to consider a set of parameter values as a point in an $s$-dimensional 
space. 

The one-dimensional discriminant function admits two very different general- 
izations in higher dimensions. The practical solution to a particular problem 
for which s is moderately large may involve a mixture of both generalizations. 

Let us generalize our statistical model before discussing the discrimination 
problem. To avoid complication of algebraic notation, let us for the moment 
assume $s = 2$. We will now postulate a set of hypothetical measurements 
$\xi_1, \xi_2, \ldots, \xi_p$: 

$$\xi_1 = \alpha_1 + \beta_1 u + \gamma_1 v + \epsilon_1$$

$$\xi_2 = \alpha_2 + \beta_2 u + \gamma_2 v + \epsilon_2$$

$$\xi_3 = \alpha_3 + \epsilon_3$$

$$\cdots$$

$$\xi_p = \alpha_p + \epsilon_p$$

where the $\epsilon_m$ are independent normal deviates with unit variance, $u$ and $v$ are 
fixed variates or parameters corresponding to the different populations, and 
$\alpha_1, \alpha_2, \ldots, \alpha_p$, $\beta_1$, $\beta_2$, $\gamma_1$, and $\gamma_2$ are constants. Evidently $\xi_3, \ldots, \xi_p$ can 
yield no information about $u$ and $v$, and $\xi_1$ and $\xi_2$ together contain all the information 
there is to get about $u$ and $v$. As before, assume that our data will be in the form 
of linear combinations $x_m = \sum_n l_{mn}\xi_n$, with unknown coefficients $l_{mn}$. The 
variance-covariance matrix within populations, or for fixed $u$, $v$, is still given by 
$\sigma_{mn} = \sum_k l_{mk} l_{nk}$. The mean values of the $x$'s for fixed $u$, $v$ are given by 

$$E(x_m) = \sum_n l_{mn}\alpha_n + (l_{m1}\beta_1 + l_{m2}\beta_2)u + (l_{m1}\gamma_1 + l_{m2}\gamma_2)v = a_m + b_m u + c_m v.$$

This model is again justifiable on the basis of Hotelling's work. 

The first question to ask is whether we can now form two linear combinations 
of the $x$'s and get rid of $\xi_3, \ldots, \xi_p$ in both, thus providing a two dimensional 
description of an individual on the basis of $x_1, x_2, \ldots, x_p$. The answer here 
is in the affirmative, as a result of a direct generalization of the method 
discussed earlier. If we calculate $X_1 = \sum\sigma^{mn}b_m x_n$ and $X_2 = \sum\sigma^{mn}c_m x_n$, we are 
fortunate enough to get 

$$X_1 = \beta_1\xi_1 + \beta_2\xi_2$$

$$X_2 = \gamma_1\xi_1 + \gamma_2\xi_2$$

with no disturbing elements from $\xi_3, \ldots, \xi_p$. Assuming for now that $X_1$ and 
$X_2$ are not merely proportional, i.e. $\beta_1\gamma_2 - \beta_2\gamma_1 \neq 0$, what do we do with $X_1$ and 
$X_2$? 

For fixed $u$, $v$, we have 

$$E(X_1) = \sum\sigma^{mn}b_m a_n + u\sum\sigma^{mn}b_m b_n + v\sum\sigma^{mn}b_m c_n = A_1 + B_1 u + C_1 v$$

$$E(X_2) = \sum\sigma^{mn}c_m a_n + u\sum\sigma^{mn}c_m b_n + v\sum\sigma^{mn}c_m c_n = A_2 + B_2 u + C_2 v$$

and variances and covariance 

$$\tau_{11} = \sum\sigma^{mn}b_m b_n = B_1$$

$$\tau_{12} = \sum\sigma^{mn}b_m c_n = C_1 = B_2$$

$$\tau_{22} = \sum\sigma^{mn}c_m c_n = C_2.$$

We may, for example, estimate $u$ and $v$ by solving the equations 

$$B_1 u + C_1 v = X_1 - A_1$$

$$B_2 u + C_2 v = X_2 - A_2,$$

or we may set up regions in the $X_1$, $X_2$ plane for which certain decisions are 
made. For example, when classifying an individual into one of three populations, 
we might delineate regions, as in Fig. 2. 

Then the particular individual would be classified as coming from population I, 
II, or III, according to which region $X_1$, $X_2$ falls in. The individual points 
shown in the figure represent the expected values of $X_1$, $X_2$ for each of the three 
populations. No exhaustive investigation has been made for this situation, but 
some fairly obvious methods are available for constructing such regions. 
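
The first alternative, estimating $u$ and $v$ from an observed pair $X_1$, $X_2$, amounts 
to solving two linear equations. A minimal sketch by Cramer's rule follows; the 
function name `estimate_uv` and the numerical values in the example are 
hypothetical, and the population quantities $A_1$, $A_2$, $\tau_{11}$, $\tau_{12}$, $\tau_{22}$ are assumed known. 

```python
def estimate_uv(x1, x2, a1, a2, tau11, tau12, tau22):
    """Solve B1*u + C1*v = X1 - A1 and B2*u + C2*v = X2 - A2 for u and v.

    Uses B1 = tau11, C1 = B2 = tau12, C2 = tau22, as in the text.
    """
    det = tau11 * tau22 - tau12 * tau12
    if det == 0:
        # The determinant vanishes exactly when X1 and X2 are proportional.
        raise ValueError("X1 and X2 are proportional; u and v cannot be separated")
    r1, r2 = x1 - a1, x2 - a2
    u = (r1 * tau22 - tau12 * r2) / det
    v = (tau11 * r2 - tau12 * r1) / det
    return u, v


# Hypothetical values: with u = 1, v = 2 the expected X's are 4.5 and 6.0.
u_hat, v_hat = estimate_uv(x1=4.5, x2=6.0, a1=0.5, a2=-1.0,
                           tau11=2.0, tau12=1.0, tau22=3.0)
print(u_hat, v_hat)
```

The determinant $\tau_{11}\tau_{22} - \tau_{12}^2$ equals $(\beta_1\gamma_2 - \beta_2\gamma_1)^2$, so the check in the code is the 
same condition as the non-proportionality of $X_1$ and $X_2$. 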

With respect to significance tests when the $\sigma_{mn}$, $b_m$, $c_m$ are estimated from 
samples, the whole gamut of multivariate analysis has to be run. Tests 
analogous to (but more complicated than) F tests exist for testing the significance 
of the discrimination, the significance of a subset of the $x$'s, and the significance 
of a theoretical pair $X_{1,0}$, $X_{2,0}$ (Wilks [41], [42], [43]). 

Fig. 2. Classification Regions in $X_1$, $X_2$ Plane 

For some purposes a two-dimensional discriminant function $X_1$, $X_2$ may be 
unsatisfactory. For example, we might suspect that $\beta_1\gamma_2 = \beta_2\gamma_1$ (or that the 
relationship is nearly satisfied). Under these circumstances $X_1$ is (nearly) 
proportional to $X_2$, and we would like to compute the best one-dimensional 
discriminant function, even though we have started with two linear parameters 
$u$ and $v$. Even if $\beta_1\gamma_2 \neq \beta_2\gamma_1$ we might still ask for the best one-dimensional 
discriminant function, in order to rank our populations on the "best" linear scale. 
If we define $Y$ as that linear combination of $x_1, x_2, \ldots, x_p$ which has the largest 
multiple correlation with $u$ and $v$, we have generalized the simple 
one-dimensional discriminant function in a second direction. 

Before proceeding, it is useful to recognize that $Y$, as defined above, must be a 
function of $X_1$, $X_2$, since $X_1$ and $X_2$ together contain all the information about 
$u$ and $v$ that can be obtained from the $x$'s. 

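Now suppose we consider an arbitrary linear combination $Y = \lambda_1 X_1 + \lambda_2 X_2$. 
$Y$ correlates best with 

$$\lambda_1(\tau_{11}u + \tau_{12}v) + \lambda_2(\tau_{12}u + \tau_{22}v) = (\lambda_1\tau_{11} + \lambda_2\tau_{12})u + (\lambda_1\tau_{12} + \lambda_2\tau_{22})v.$$

We now have to choose $\lambda_1$ and $\lambda_2$ to maximize this correlation. This 
correlation will be maximized if we maximize the ratio of the variance of 

$$(\lambda_1\tau_{11} + \lambda_2\tau_{12})u + (\lambda_1\tau_{12} + \lambda_2\tau_{22})v$$

(over the distribution of $u$ and $v$ values) to the variance of $Y$ for fixed $u$ and $v$. 
Call the first quantity $S_1$, the second $S_2$. Then $S_2 = \lambda_1^2\tau_{11} + 2\lambda_1\lambda_2\tau_{12} + \lambda_2^2\tau_{22}$ 
and $S_1$ is of the form $\lambda_1^2\mu_{11} + 2\lambda_1\lambda_2\mu_{12} + \lambda_2^2\mu_{22}$, where 

$$\mu_{11} = \tau_{11}^2\sigma_{uu} + 2\tau_{11}\tau_{12}\sigma_{uv} + \tau_{12}^2\sigma_{vv}$$

$$\mu_{12} = \tau_{11}\tau_{12}\sigma_{uu} + (\tau_{12}^2 + \tau_{11}\tau_{22})\sigma_{uv} + \tau_{12}\tau_{22}\sigma_{vv}$$

$$\mu_{22} = \tau_{12}^2\sigma_{uu} + 2\tau_{12}\tau_{22}\sigma_{uv} + \tau_{22}^2\sigma_{vv},$$

and $\sigma_{uu}$, $\sigma_{uv}$, $\sigma_{vv}$ are the variances and covariance of $u$ and $v$ over that 
distribution. Maximizing $S_1/S_2$ leads to the equations: 

$$\lambda_1\mu_{11} + \lambda_2\mu_{12} = \frac{S_1}{S_2}(\lambda_1\tau_{11} + \lambda_2\tau_{12})$$

$$\lambda_1\mu_{12} + \lambda_2\mu_{22} = \frac{S_1}{S_2}(\lambda_1\tau_{12} + \lambda_2\tau_{22})$$

i.e. 

$$\lambda_1(\mu_{11} - \theta\tau_{11}) + \lambda_2(\mu_{12} - \theta\tau_{12}) = 0$$

$$\lambda_1(\mu_{12} - \theta\tau_{12}) + \lambda_2(\mu_{22} - \theta\tau_{22}) = 0, \quad \text{with } \theta = S_1/S_2.$$

It is thus seen that $\theta$ must satisfy the quadratic equation 

$$(\mu_{11} - \theta\tau_{11})(\mu_{22} - \theta\tau_{22}) - (\mu_{12} - \theta\tau_{12})^2 = 0,$$

in order for solutions $\lambda_1$, $\lambda_2$ to exist. In general there will be two solutions, of 
which the greater corresponds to that linear combination $\lambda_1 X_1 + \lambda_2 X_2$ which has 
greatest multiple correlation with $u$ and $v$, whereas the smaller corresponds to 
that linear combination which has least multiple correlation with $u$ and $v$. 
$\theta$ itself corresponds to $R^2/(1 - R^2)$ for the regression of $\lambda_1 X_1 + \lambda_2 X_2$ on $u$, $v$. 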
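
For the two-dimensional case the whole computation reduces to a quadratic. The 
sketch below is an illustration under stated assumptions: the matrix $(\tau_{ij})$ is taken 
to be non-singular, $\theta$ is taken as the ratio $S_1/S_2$ of the variance among $u$, $v$ values to 
the variance for fixed $u$, $v$, and the function names `mu_matrix` and `best_combination` 
are ours. Symmetric $2 \times 2$ matrices are passed as nested tuples. 

```python
import math

def mu_matrix(tau, s_uv):
    """Quadratic form mu_ij of S1, from tau_ij and the (co)variances of u and v."""
    (t11, t12), (_, t22) = tau
    (suu, suv), (_, svv) = s_uv
    m11 = t11 * t11 * suu + 2 * t11 * t12 * suv + t12 * t12 * svv
    m12 = t11 * t12 * suu + (t12 * t12 + t11 * t22) * suv + t12 * t22 * svv
    m22 = t12 * t12 * suu + 2 * t12 * t22 * suv + t22 * t22 * svv
    return ((m11, m12), (m12, m22))

def best_combination(tau, mu):
    """Roots of det(mu - theta * tau) = 0 and the weights for the larger root."""
    (t11, t12), (_, t22) = tau
    (m11, m12), (_, m22) = mu
    a = t11 * t22 - t12 * t12                      # assumed non-zero
    b = -(m11 * t22 + m22 * t11 - 2 * m12 * t12)
    c = m11 * m22 - m12 * m12
    disc = math.sqrt(max(b * b - 4 * a * c, 0.0))
    small, large = (-b - disc) / (2 * a), (-b + disc) / (2 * a)
    # (lambda1, lambda2) annihilates both rows of mu - large * tau;
    # read it off whichever row is numerically larger.
    row1 = (m11 - large * t11, m12 - large * t12)
    row2 = (m12 - large * t12, m22 - large * t22)
    if abs(row1[0]) + abs(row1[1]) >= abs(row2[0]) + abs(row2[1]):
        lam = (-row1[1], row1[0])
    else:
        lam = (row2[1], -row2[0])
    return large, small, lam


# Hypothetical inputs: uncorrelated X1, X2 with unit variance for fixed u, v,
# and u four times as variable as v over the populations.
tau = ((1.0, 0.0), (0.0, 1.0))
mu = mu_matrix(tau, ((4.0, 0.0), (0.0, 1.0)))
theta_max, theta_min, lam = best_combination(tau, mu)
print(theta_max, theta_min, lam)
```

In this example all the weight falls on $X_1$, the combination that follows the more 
variable parameter $u$. The weights are determined only up to a constant factor. 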

In the general case with $s$ degrees of freedom corresponding to $w_1, w_2, \ldots, w_s$, 
there is an $s$-dimensional discriminant function $(X_1, X_2, \ldots, X_s)$, and a set of 
$s$ linear combinations for which $R^2/(1 - R^2)$ is stationary with respect to 
$\lambda_1, \ldots, \lambda_s$. 

The $s$ roots (corresponding to an equation of degree $s$) arranged in decreasing 
order, permit construction of the best one-dimensional, two-dimensional, ..., 
$(s - 1)$-dimensional discriminant functions. 

Discussion of the relevant significance tests for these reduced discriminant 
functions is beyond the scope of this paper. Reference may be made to the 
work of Hotelling and Fisher. 

NON-PARAMETRIC ESTIMATION II. STATISTICALLY EQUIVALENT 
BLOCKS AND TOLERANCE REGIONS - THE CONTINUOUS CASE 

By John W. Tukey 
Princeton University 

1. Summary. Wald [2, 1943] extended the usefulness of tolerance limits to 
the simplest multi-dimensional cases. His principle is here used to provide 
many new ways of using a sample of $n$ to divide the range of the population into 
$n + 1$ blocks of known behavior. The exact tolerance distribution for the 
proportions of the population covered by these blocks is extended from the case 
of a continuous probability density function to the case of a continuous 
cumulative distribution function. Such an extension is needed in dealing completely 
with multivariate cases even where the underlying distribution is as smooth as a 
multivariate normal distribution. 

The devices used in Paper I [1] to extend the usefulness of tolerance limits to 
the case of a discontinuous underlying distribution will be applied in the next 
paper of this series, with some extension, to extend the usefulness of these 
general tolerance regions to the case of a discontinuous distribution. Some of these 
results specialize into new results for the univariate case, although they do not 
seem to have any immediate practical application. 

The author wishes to acknowledge the stimulation given to his work on this 
problem by Henry Scheffé, whose modesty has kept this paper from the joint 
authorship of papers I [1, Scheffé and Tukey 1945] and IV (not yet written). 

2. Introduction. Wald's great contribution to the theory of tolerance limits 
was his method of successive elimination. As originally presented for a 
bivariate situation it ran roughly as follows. Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ 
be a sample of $n$ from an arbitrary bivariate population. The type of tolerance 
region to be used is determined by four preassigned integers, $k_1$, $k_2$, $k_3$, and 
$k_4$. The procedure is as follows: Order the $n$ observations according to their $x$ 
values. Select the $k_1$ highest, and let the $x$ coordinate of the lowest of these $k_1$ 
be $x_u$. Select the $k_2$ lowest, and let the $x$ coordinate of the highest of these 
$k_2$ be $x_l$. Discard these $k_1 + k_2$ selected observations, and order the remaining 
$n - k_1 - k_2$ observations according to their $y$ values. Select the $k_3$ highest of 
these remaining observations, and let the $y$ coordinate of the lowest of these $k_3$ 
be $y_u$. Select the $k_4$ lowest of these remaining observations, and let the $y$ 
coordinate of the highest of these $k_4$ be $y_l$. The tolerance region, consisting 
of all points $(x, y)$ with $x_l < x < x_u$ and $y_l < y < y_u$, depends on the sample, 
and, hence, so does the fraction of the population falling in (= covered by) this 
region. Wald showed that the distribution of this fraction covered was 
independent of the underlying bivariate distribution, so long as this latter 
distribution had a continuous probability density function. He showed that the 
distribution was the same as that arising in the one-dimensional case when a 
tolerance region was set with the aid of $k_1 + k_2 + k_3 + k_4$ observations. 
(Numerical approximation to these distributions will be discussed in Paper IV of this 
series.) 

The important device in this process, and the one which makes the conclusion 
possible, is the discarding of the $k_1 + k_2$ observations after they have played 
their part by determining $x_l$ and $x_u$. 

We shall shortly be able to describe this procedure of Wald's as a special case 
of a more general procedure, but we shall first go back to the simplest 
one-dimensional case to explain some of our notions and terminology. 

Consider the uniform distribution from 0 to 1, draw a sample of $n$, and let the 
sample values, ordered according to size, be $t_1, t_2, \ldots, t_n$. These $n$ values 
divide the interval from 0 to 1 into the following $n + 1$ parts $(0, t_1), (t_1, t_2), \ldots, 
(t_n, 1)$ which we shall call blocks. Since the joint distribution of the 
$t_i$ is well known, that of the lengths of these $n + 1$ blocks is easily found. This 
distribution of lengths would be unimportant, if it were not at the same time the 
distribution of the fractions of the population covered by the blocks. As is 
shown later, this distribution of fractions covered, or, more simply, of coverages, 
has the following properties: 

(i) the fractions covered add up to 1. 

(ii) the distribution is completely symmetrical. 

Property (ii) makes intuitive the result of Wilks [3, 1941] that the distributions 
of the coverage of regions obtained 

(a) by removing the $k_1 + k_2$ left-most blocks,

(b) by removing the $k_1$ left-most and the $k_2$ right-most blocks
are identical. The specific distribution obtained satisfies 

(iii) if the coverages are taken as barycentric coordinates on an $n$-simplex,
the distribution over the simplex is uniform,

(iv) the sum of the coverages of any $k$ preselected blocks of the $n + 1$ has
the well-known distribution

$\Pr\{\text{sum of } k \text{ coverages} < t\} = I_t(n - k + 1, k)$

where $I_t(n, m)$ is the incomplete Beta function.
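
Properties (i), (ii) and (iv) can be checked numerically. The sketch below is an illustration only, not part of the original argument: it draws uniform samples, forms the $n + 1$ coverages, and compares the mean and variance of the sum of $k$ preselected coverages with those of a Beta($k$, $n - k + 1$) variate, which is the law of such a sum in the usual Beta parametrisation (the argument order of $I_t$ depends on the convention adopted). The function names, sample sizes and seed are assumptions chosen for the example.

```python
import random

def coverages(n, rng):
    """Coverages of the n + 1 blocks cut out of (0, 1) by n uniform sample values."""
    t = sorted(rng.random() for _ in range(n))
    edges = [0.0] + t + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]

def mean_var(xs):
    """Sample mean and (population-style) variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

rng = random.Random(0)
n, k, reps = 9, 3, 20000
left, mixed = [], []
for _ in range(reps):
    c = coverages(n, rng)
    left.append(sum(c[:k]))            # the k left-most blocks
    mixed.append(c[0] + c[4] + c[-1])  # another fixed choice of k blocks

# Mean and variance of a Beta(k, n - k + 1) variate.
beta_mean = k / (n + 1)
beta_var = k * (n + 1 - k) / ((n + 1) ** 2 * (n + 2))
print(mean_var(left), mean_var(mixed), (beta_mean, beta_var))
```

Both choices of blocks should give nearly the same mean and variance, which is the symmetry expressed by property (ii).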

We shall call a set of blocks, derived from a sample, whose coverages behave in
this general way a set of statistically equivalent blocks. Normally this will be
abbreviated to se-blocks. (A precise definition is given in section 4.)

We shall concentrate much of our attention on all the blocks and their sym- 
metrical character, rather than on the tolerance region formed by deleting k 
of them, since our results will then be applicable to many other problems. 

Now we can generalize Wald's original procedure. Let $W_1, W_2, \ldots, W_n$
be a sample of $n$ (we shall not need to consider its distribution) and let
$\varphi_1, \varphi_2, \ldots, \varphi_n$ be $n$ numerically valued functions of $W$, possibly
alike, possibly distinct, such that $\varphi_1(W), \varphi_2(W), \ldots, \varphi_n(W)$ have a
joint distribution. Proceed as follows:

Order the $W_i$ according to the numbers $\varphi_1(W_i)$, select the $W_i$ for which
$\varphi_1(W_i)$ is largest and denote it by $W_{i(1)}$. The first block contains all $W$
such that

(2.1a) $\varphi_1(W) > \varphi_1(W_{i(1)})$.

Discarding $W_{i(1)}$, order the remaining $W_i$ according to the values of $\varphi_2(W_i)$,
and select as $W_{i(2)}$ the one giving the largest value. The second block contains
all $W$ such that

(2.1b) $\varphi_1(W) < \varphi_1(W_{i(1)})$, $\varphi_2(W) > \varphi_2(W_{i(2)})$.

Continue this process. The $m$th block, for $m \le n$, will be defined by

(2.1m) $\varphi_j(W) < \varphi_j(W_{i(j)})$, $j = 1, 2, \ldots, m - 1$,
$\varphi_m(W) > \varphi_m(W_{i(m)})$,

and the $(n + 1)$st block by

(2.1n) $\varphi_j(W) < \varphi_j(W_{i(j)})$, $j = 1, 2, \ldots, n$.

(A graphical example of this construction is given shortly.) This set of $n + 1$
blocks will be statistically equivalent whenever the cumulative distribution of
each $\varphi_i$ function is continuous.
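
The construction (2.1a) to (2.1n) translates directly into a short routine. The following is a minimal sketch under stated assumptions: sample points are arbitrary Python objects, each $\varphi$ is an ordinary function returning a number, ties are ignored, and the names `build_cuts` and `block_index` are hypothetical. The usage lines apply it to Wald's bivariate case with one cut in each of the four directions.

```python
def build_cuts(sample, phis):
    """Return the cut values of construction (2.1), one per function used.

    At step k the remaining sample point with the largest phi_k value is
    selected, its phi_k value becomes the k-th cut, and the point is discarded.
    """
    remaining = list(sample)
    cuts = []
    for phi in phis[:len(sample)]:
        chosen = max(remaining, key=phi)
        cuts.append(phi(chosen))
        remaining.remove(chosen)
    return cuts

def block_index(w, phis, cuts):
    """0-based index of the block containing w; len(cuts) is the final block."""
    for k, (phi, a) in enumerate(zip(phis, cuts)):
        if phi(w) > a:
            return k
    return len(cuts)

# Wald's bivariate case with k_1 = k_2 = k_3 = k_4 = 1:
# the functions are x, -x, y, -y in that order.
phis = [lambda p: p[0], lambda p: -p[0], lambda p: p[1], lambda p: -p[1]]
sample = [(0.9, 0.1), (0.1, 0.5), (0.5, 0.95), (0.4, 0.05), (0.5, 0.5)]
cuts = build_cuts(sample, phis)
print(cuts)
print(block_index((0.5, 0.5), phis, cuts))  # index 4: the tolerance region
```

A point is assigned to the first block whose defining inequality it satisfies, which is equivalent to the chain of conditions in (2.1m) because all earlier inequalities must have failed.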

To specialize this to the case described above, let $W$ be a pair $(x, y)$ of numbers
and let

(i) the first $k_1$ $\varphi$'s be the $x$-coordinate of $W$,

(ii) the next $k_2$ $\varphi$'s be minus the $x$-coordinate of $W$,

(iii) the next $k_3$ $\varphi$'s be the $y$-coordinate of $W$,

(iv) the next $k_4$ $\varphi$'s be minus the $y$-coordinate of $W$,

(v) the remaining $\varphi$'s be arbitrary.

Then the first $k_1$ blocks will contain all $W$ for which

$x = \varphi_j(W) > \varphi_j(W_{i(j)})$, $j = 1, 2, \ldots, k_1$,

that is, for which

$x > x_u = \varphi_{k_1}(W_{i(k_1)})$.

Similarly, the next $k_2 + k_3 + k_4$ blocks will contain all $W$ with

$x < x_l$,

$y > y_u$, $x_l < x < x_u$,

$y < y_l$, $x_l < x < x_u$,
respectively, and the removal of these $k_1 + k_2 + k_3 + k_4$ blocks leaves Wald's
tolerance region (plus the boundaries where $x = x_u$, $x = x_l$, $y = y_u$, $y = y_l$).
There would be no point in this more general wording, if it did not include 





new cases of some interest. We give now, in graphic terms, an example of such 
a case. 

We deal with a sample of $n$ bivariate observations, which we think of as plotted
on a map so that we can use geographical language. The number $n$ is rather
large, and we wish to construct a tolerance region by deleting 12 blocks. We
proceed as follows:

Find the most northerly point, draw an East-West line through it, and shade
the area North of the line. Find the most easterly point in the unshaded area,
draw a North-South line through it, and shade the unshaded area East of the
line. Find the most southerly point (always working in the unshaded area),
draw an East-West line through it and shade the area South of the line. Find
the most westerly point, draw a North-South line through it, and shade the area
West of the line. Find the most northeasterly point, draw a NW-SE line through
it and shade the area northeast of the line. Find the most southeasterly point,
draw a NE-SW line through it, and shade the area southeast of the line. Repeat
this 6 times more, choosing in succession the most southwesterly, northwesterly,
northerly, easterly, southerly, and westerly points. The remaining points will
now lie in an unshaded area surrounded by a polygon, which will have 8 (or
perhaps fewer) sides. The inside of this polygon is the desired tolerance region.

Fig. 1

Figure 1 shows the final result, starting from $n = 25$. The practicing
statistician is invited to try an example of his own with $n$ at least 100.

Other newly accessible cases can easily be invented by the reader, after he con- 
siders this example carefully. 
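
The geographical example can be tried on a computer. The sketch below is an illustration under assumptions not found in the text: the population is taken to be a pair of independent standard normal variates, the twelve directions are coded as linear functions of $(x, y)$, and the coverage of each region is estimated from fresh draws. If the blocks are statistically equivalent, the mean coverage of the region left after deleting 12 of the $n + 1$ blocks should be close to $(n + 1 - 12)/(n + 1)$.

```python
import random

# The 12 direction functions in the order of the text:
# N, E, S, W, NE, SE, then SW, NW, N, E, S, W.
DIRECTIONS = [
    lambda p: p[1],          # northerly
    lambda p: p[0],          # easterly
    lambda p: -p[1],         # southerly
    lambda p: -p[0],         # westerly
    lambda p: p[0] + p[1],   # northeasterly
    lambda p: p[0] - p[1],   # southeasterly
    lambda p: -p[0] - p[1],  # southwesterly
    lambda p: -p[0] + p[1],  # northwesterly
    lambda p: p[1],
    lambda p: p[0],
    lambda p: -p[1],
    lambda p: -p[0],
]

def tolerance_region(points):
    """Return (phi, cut) pairs; the region is where phi(p) < cut for every pair."""
    remaining = list(points)
    cuts = []
    for phi in DIRECTIONS:
        extreme = max(remaining, key=phi)  # most extreme point still unshaded
        cuts.append((phi, phi(extreme)))
        remaining.remove(extreme)
    return cuts

def inside(p, cuts):
    return all(phi(p) < cut for phi, cut in cuts)

def draw(rng):
    # Assumed population: independent standard normals.
    return (rng.gauss(0, 1), rng.gauss(0, 1))

rng = random.Random(1)
n, reps, fresh = 25, 300, 200
estimates = []
for _ in range(reps):
    cuts = tolerance_region([draw(rng) for _ in range(n)])
    hits = sum(inside(draw(rng), cuts) for _ in range(fresh))
    estimates.append(hits / fresh)

mean_coverage = sum(estimates) / reps
print(round(mean_coverage, 3), (n + 1 - 12) / (n + 1))
```

Selecting the extreme point among those not yet discarded is the same as "always working in the unshaded area", since every undiscarded point lies on the unshaded side of all lines drawn so far.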

The use of a single $W$ and $n$ functions $\varphi_i$ has two virtues: it simplifies
notation and frees the intuition, as compared with the use of the $n$ chance
quantities $\varphi_i(W)$.

If the bivariate situation above were regarded as a 12-variate situation, where
the variates were, in order, $(y, x, -y, -x, x + y, x - y, -x - y, -x + y,
y, x, -y, -x)$, then the original Wald procedure with $k_1 = k_3 = \cdots = k_{23} = 1$;
$k_2 = k_4 = \cdots = k_{24} = 0$ would apply to construct the same region. Yet
even if $x$ and $y$ had a bivariate normal distribution, Wald's proof would not
apply without extension. For the 12-dimensional distribution is highly singular
(it is concentrated on a 2-dimensional plane in 12-dimensional space) and there
is no hope of a density function. An extension of Wald's result to the case
where the 12-dimensional joint cumulative distribution function is continuous
(as is the case in this example when $x$ and $y$ have a continuous joint cumulative)
is clearly needed.

When we come to deal with the case where the cumulative need not be
continuous we shall meet a further difficulty, namely "ties". But if, as in the
present case, the cumulative is continuous, it is easy to see that the probability
that $\varphi_i(W_j) = \varphi_i(W_k)$ for any $i$ and any $j \ne k$ is zero.

3. Terminology and notation. A quantity which has a probability distribution
we call a chance quantity (it has frequently been called a random variable).
The term chance quantity does not imply that its values are single real numbers;
they may be single real numbers (when we also speak of a real chance quantity),
sets of $n$ real numbers, or more general objects. The cumulative distribution
function, or cumulative, of a single real chance quantity, $X$, is defined by

$F(t) = \Pr\{X < t\}$,

except perhaps at the discontinuities of $F$. We have used here the notation
$\Pr\{k(X)\}$ to indicate the probability that $k(X)$ holds, and we have followed our
policy of using capital letters for chance quantities and the corresponding
lower case letters for their values.

The set of values of $W$, or, as we shall say, the $W$-set, for which, for example
$\varphi(W) < 3$, will be denoted by

$\{W \mid \varphi(W) < 3\}$.

We shall wish to compute probabilities associated with one or more functions
of a chance quantity; usually we will emphasize that these functions shall be
measurable with respect to the probability measure underlying the distribution
of $W$ by asserting that they have a joint cumulative, which is defined by

$F(t_1, t_2, \ldots, t_k) = \Pr\{\varphi_1(W) < t_1, \varphi_2(W) < t_2, \ldots, \varphi_k(W) < t_k\}$,

(except possibly at discontinuities of $F$) and which does not exist unless the
$\varphi_i$ are measurable with respect to the unknown underlying distribution of $W$. In
cases where we neglect to remind the reader, it is still assumed that the functions
are measurable.

The coverage of a $W$-set, which may itself be a chance quantity, is defined by

Coverage of $S$ = $\Pr\{W \in S\}$.

When $S$ is a chance quantity, its coverage is also a chance quantity. The
barycentric simplex (of dimension $n$) is the set of points in $(n + 1)$-dimensional
Euclidean space $(t_1, t_2, \ldots, t_{n+1})$ with $t_1 + t_2 + \cdots + t_{n+1} = 1$ and
$0 \le t_i \le 1$. The name comes from the representation of the point
$(t_1, t_2, \ldots, t_{n+1})$ as the center of gravity (in mechanical terms) or mean (in
statistical terms) of the distribution where a fraction $t_i$ is concentrated at the
$i$th vertex. (In order, the vertices are $(1, 0, 0, \ldots, 0)$, $(0, 1, 0, \ldots, 0)$, etc.)
The uniform distribution on this simplex has an ($n$-dimensional) density

$n!\, dt_1\, dt_2 \cdots dt_n$, $(0 \le t_1, t_2, \ldots, t_n,\ 1 - t_1 - t_2 - \cdots - t_n \le 1)$,

and the cumulative

$T(x_1, x_2, \ldots, x_{n+1}) = n! \int\!\!\int \cdots \int dt_1\, dt_2 \cdots dt_n$

where the integration is over the range where $0 \le t_i \le x_i$ and at the same time
$t_1 + t_2 + \cdots + t_n \le 1$.


4. The blocks determined by n values of W. We deal now with a population
of $W$'s (a probability measure $\mu$ on the space $T = \{w\}$), a family of functions
$\varphi_1, \varphi_2, \ldots, \varphi_m$ of $W$ with a joint cumulative (measurable with respect to
$\mu$) and a set of values $w_1, w_2, \ldots, w_n$, ($w_i \in T$).

(4.1) Definition. The set $w_1, w_2, \ldots, w_n$ and the functions
$\varphi_1, \varphi_2, \ldots, \varphi_m$ define blocks as follows:

(4.2) $S_1 = \{w \mid \varphi_1(w) > a_1\}$

where $a_1 = \max_i \varphi_1(w_i) = \varphi_1(w_{i(1)})$, which defines $i(1)$.

(4.3) $S_2 = \{w \mid \varphi_1(w) < a_1,\ \varphi_2(w) > a_2\}$,

where $a_2 = \max_{i \ne i(1)} \varphi_2(w_i) = \varphi_2(w_{i(2)})$, $i(2) \ne i(1)$, which defines
$i(2)$. And in general, for $1 < k \le \min(m, n)$,

(4.4) $S_k = \{w \mid \varphi_1(w) < a_1, \ldots, \varphi_{k-1}(w) < a_{k-1},\ \varphi_k(w) > a_k\}$,

where $a_k = \max \varphi_k(w_i) = \varphi_k(w_{i(k)})$, the maximum being taken over all $i$ except
$i(1), i(2), \ldots, i(k - 1)$; and $i(k)$ being chosen distinct from all $i(j)$, $j < k$.

If $m \ge n$, then

(4.5) $S_{n+1} = \{w \mid \varphi_1(w) < a_1, \ldots, \varphi_n(w) < a_n\}$.

If $m \le n$, then

(4.6) $S_{m|n+1} = \{w \mid \varphi_1(w) < a_1, \ldots, \varphi_m(w) < a_m\}$.

The result of this definition is to use $w_1, \ldots, w_n$ and $\varphi_1, \ldots, \varphi_m$ to define
$n + 1$ blocks (one more than there are $w$'s) in case there are enough functions,
and, in case there are not enough functions, to define one small block, $S_i$, for
each function plus one large remainder $S_{m|n+1}$. We notice

(4.7) Remark. The blocks of (4.1) are well defined unless $\varphi_i(w_j) = \varphi_i(w_k)$ for
some $i$, $j$, $k$ with $j \ne k$.

5. Statement of results for the statistician. The central results can be stated
as follows:

(5.1) Theorem $A_{m|n+1}$. If $W_1, W_2, \ldots, W_n$ are a sample of $n$ from a
distribution, if $\varphi_1, \varphi_2, \ldots, \varphi_m$, $(m \le n)$, are $m$ functions such that

$\varphi_1(W), \varphi_2(W), \ldots, \varphi_m(W)$

have a joint distribution which has a continuous cumulative, and if the blocks
$S_1, S_2, \ldots, S_m$ and $S_{m|n+1}$ are defined as in (4.1), then

(i) the blocks are disjoint chance sets, uniquely defined with probability one,

(ii) the distribution of the coverages

$c_i = \Pr\{w \text{ in } S_i\}$, $i = 1, 2, \ldots, m$

and

$c_{m|n+1} = \Pr\{w \text{ in } S_{m|n+1}\}$

is the same as that of $t_1, t_2, \ldots, t_m$ and $t_{m+1} + t_{m+2} + \cdots + t_{n+1}$ where the $t_i$
are uniformly distributed on the barycentric simplex with $n + 1$ vertices.

Conditions (5.1i) and (5.1ii) are the precise definition of a partial family of
statistically equivalent blocks of type $n + 1$ and an associated $(m \mid n + 1)$ tolerance
region.

(5.2) Theorem $B_{n+1}$. If $W_1, W_2, \ldots, W_n$ are a sample of $n$ from a
distribution, and if $\varphi_1, \varphi_2, \ldots, \varphi_m$, $(m \ge n)$, are $m$ functions such that

$\varphi_1(W), \varphi_2(W), \ldots, \varphi_m(W)$

have a joint distribution which has a continuous cumulative, and if the blocks
$S_1, S_2, \ldots, S_{n+1}$ are defined as in (4.1), then

(i) the blocks are disjoint chance sets, defined with probability one,

(ii) the distribution of the coverages

$c_i = \Pr\{w \text{ in } S_i\}$, $i = 1, 2, \ldots, n + 1$

is the same as that of $t_1, t_2, \ldots, t_{n+1}$, where the $t_i$ are uniformly distributed
on the barycentric simplex with $n + 1$ vertices.

Conditions (5.2i) and (5.2ii) are the precise definition of a complete family of
statistically equivalent blocks. In Paper III we shall have to widen these notions
a little, and this form will then be qualified by the phrase "in the narrow sense".

6. Statement of results for the measure theorist. The construction of (4.1)
maps the product $T^n \times U^n$ into $E^{n+1}$, where $T$ is the set of $w$'s (and hence $T^n$ is
the set of ordered $n$-tuples of $w$'s), $U$ is the space of all real-valued functions
defined over $T$, measurable with respect to a fixed probability measure $\mu$, and
possessing a continuous cumulative (i.e. $\mu(\{w \mid \varphi(w) = c\}) = 0$ for all real $c$),
and hence $U^n$ is the space of ordered $n$-tuples of such functions, and $E^{n+1}$ is
Euclidean $(n + 1)$-dimensional space. More precisely, the mapping is into the
barycentric simplex with $n + 1$ vertices, a subset of $E^{n+1}$, and is well defined except
for a set in $T^n$ of measure zero with respect to $\mu^n$, the power measure of $\mu$. In
these terms, we may restate theorem B as follows:

(6.1) Theorem $B_{n+1}$. Hold the $n$ functions $\varphi_1, \ldots, \varphi_n$ and the probability
measure $\mu$ fixed, then $T^n$ is mapped into $B_n$ and the power measure $\mu^n$ is carried by
that mapping into a measure on $B_n$. This measure is always $n!$ times Lebesgue
measure.

7. Wald's principle. The essential principle behind Wald's process of
discarding observations is sufficiently fundamental to warrant a name of its own.
It can be stated, quite generally, in the two following forms:

(7.1) Wald's Principle (discrete form). Let $W$ be a chance quantity,
and consider samples of $n$. Fix disjoint $w$-sets $A_1, A_2, \ldots, A_m, B$. Consider
those samples of $n$ for which exactly one value falls in each $A_i$ and the remaining
$n - m$ fall in $B$. The distribution of the $n - m$ falling in $B$ is that of a random sample
of $n - m$ from the distribution of $W$ restricted to $B$. (I.e. $\mu_B(X) = [\mu(B)]^{-1}\mu(B \cap X)$.)

(7.2) Wald's Principle (conditional form). Let $W$ be a chance quantity,
and $\varphi$ a function such that each value of $\varphi(W)$ has probability zero. Consider
samples of $n$. Then the conditional distribution of the $w_i$, given that

$\max_i \varphi(w_i) = a$,

is that of one $w_{i'}$ with $\varphi(w_{i'}) = a$ and a sample of $n - 1$ other $w_i$ from the
distribution of $W$ restricted to $B = \{w \mid \varphi(w) < a\}$.

(7.3) Central Lemma. Let $W$ be a chance quantity and let $\varphi_1, \ldots, \varphi_n$ be
functions with a joint cumulative such that $\varphi_i(w) = a$ has probability zero for each
$i$ and $a$ (i.e. the joint cumulative is continuous). Then the conditional
distribution of the remaining $n - k$ $w$'s, after $k$ blocks have been chosen according to (4.1),
is that of a sample from the distribution of $W$ restricted to

$B = \{w \mid \varphi_1(w) < a_1, \ldots, \varphi_k(w) < a_k\}$,

where $k = 1, 2, \ldots, n$.
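
The conditional form (7.2) can be illustrated in the simplest setting. The sketch below assumes $W$ uniform on $(0, 1)$ and $\varphi(w) = w$, neither of which is required by the principle; these choices, the seed and the sample sizes are assumptions made for the example. Given the maximum $a$, the other values should behave like a sample from the uniform distribution on $(0, a)$, so a randomly chosen one divided by $a$ should be uniform on $(0, 1)$ and uncorrelated with $a$.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def corr(xs, ys):
    """Ordinary product-moment correlation of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    vx = mean([(x - mx) ** 2 for x in xs])
    vy = mean([(y - my) ** 2 for y in ys])
    return cov / (vx * vy) ** 0.5

rng = random.Random(2)
n, reps = 5, 20000
maxima, ratios = [], []
for _ in range(reps):
    sample = [rng.random() for _ in range(n)]
    a = max(sample)
    sample.remove(a)            # discard the observation that fixed the cut
    other = rng.choice(sample)  # one of the n - 1 remaining values
    maxima.append(a)
    ratios.append(other / a)    # should be uniform on (0, 1)

ratio_mean = mean(ratios)
ratio_var = mean([(r - ratio_mean) ** 2 for r in ratios])
print(ratio_mean, ratio_var, corr(maxima, ratios))
```

A uniform variate on $(0, 1)$ has mean $1/2$ and variance $1/12$, which gives two simple figures to compare against.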

The proofs of these statements are elementary and direct. To establish (7.1)
we have only to show that given two sets in $B^{n-m}$, their probabilities on the
assumption that one $w_i$ is in each $A_i$ are in the ratio of their probabilities for an
unrestricted sample of $n - m$. But the probability of finding the $n - m$ $w_i$ in a
set $R$, contained in $B^{n-m}$, and one $w_i$ in each $A_i$, is exactly a fixed factor,
depending only on $n$, $B$ and the $A_i$, times the probability that $n - m$ $w_i$, known to
be in $B^{n-m}$, will fall in $R$. This establishes (7.1).

In order to prove (7.2) we must show that the probability of a set $R$ of
$n$-tuples $w_1, w_2, \ldots, w_n$ is the same whether calculated directly or calculated by
the proposed conditional distribution. To this end, it is natural to decompose
$R$ as follows:

$R = R(1) + R(2) + \cdots + R(n) + Z$,

where $R(i)$ contains those $(w_1, \ldots, w_n)$ in $R$ for which $\varphi(w_i) > \varphi(w_j)$ for all
$j \ne i$, and $Z$ contains the remaining $(w_1, \ldots, w_n)$, which must involve at least
one tie $\varphi(w_j) = \varphi(w_k)$, $j \ne k$. Since $Z$ has probability zero, it will suffice to
establish the equality of the two calculations for sets of the form $R(i)$, and
because of symmetry we may restrict ourselves to sets of the form $R(1)$.

Given an integer $N$, we decompose the range of $\varphi(w)$ into $Nn$ segments of equal
probability, which we may do because the cumulative of $\varphi$ is continuous. There
are then values $b_0, b_1, \ldots, b_{Nn}$, ($b_0 = -\infty$, $b_{Nn} = +\infty$), such that

$\Pr\{b_{k-1} < \varphi(w) < b_k\} = 1/Nn$.

We now decompose our set $R$ (which is of the form $R(1)$) as follows:

$R = R_1 + R_2 + \cdots + R_{Nn} + Y$,

where $R_k$ contains those $n$-tuples $(w_1, \ldots, w_n)$ for which
$b_{k-1} < \varphi(w_1) < b_k$ and $\varphi(w_i) < b_{k-1}$ for all $i > 1$. The remaining set $Y$
contains $n$-tuples where the two largest $\varphi(w_i)$ ($i = 1$ and $i = i'$) belong to the
same interval. The probability of this is less than

$\frac{n(n-1)}{2} \cdot \frac{1}{nN} < \frac{n}{2N}$

as calculated from the known distribution. Calculating from the conditional
distribution, we find immediately a bound of



where $A_n$ is a constant depending only on $n$. Thus, as $N$ increases, the
probability of the successive sets $Y$ tends to zero, calculated either way. To show
the equivalence of the two calculations it is now sufficient to show that they
agree for the sets $R_k$. But this is a case of (7.1) and the lemma is proved.

Now (7.3) follows by induction, applying (7.2) at each step.

8. Proof of theorems. We notice that Theorem $B_{n+1}$ is equivalent to Theorem
$A_{n|n+1}$, since, according to (4.1), $S_{n|n+1} = S_{n+1}$.

We have only to prove theorem $A_{m|n+1}$, which we do by induction on $m$.
For $m = 1$, it is exactly Wilks' [3, 1941] original one-dimensional theorem, and
is known. Let us assume it for $m = k$ and demonstrate it for $m = k + 1$, for
by induction this will complete the proof.

We must deal with the blocks $S_1, S_2, \ldots, S_k, S_{k+1}$ and $S_{k+1|n+1}$, (notation
as in (4.1) and (5.1)). We need the obvious

(8.1) Lemma. Since the cumulative of $\varphi_{k+1}$ is continuous, the union of $S_{k+1}$
and $S_{k+1|n+1}$ differs from $S_{k|n+1}$ by a set of zero probability.

Hence

$c_{k|n+1} = c_{k+1} + c_{k+1|n+1}$.

Since we know from the induction hypothesis that $c_1, c_2, \ldots, c_k$ and $c_{k|n+1}$
have the correct joint distribution, we have only to show that $c_{k+1}$ and
$c_1, c_2, \ldots, c_k$ have the correct joint distribution. Fix $c_1, c_2, \ldots, c_k$. Then
$a_1, a_2, \ldots, a_k$ must be fixed, and so (7.3) applies to the $n - k$ $w_i$'s not
discarded after $a_1, a_2, \ldots, a_k$ have been fixed. The conditional distribution of
$c_{k+1}$ must be that of a fixed number $(1 - c_1 - c_2 - \cdots - c_k)$, which is the
probability attached to $S_{k|n+1}$, times the coverage of one block based on a sample
of $n - k$, since the remaining $n - k$ $w$'s behave like a sample.

Consider the very particular case where $w$ is uniformly distributed between
zero and one and $\varphi_i(w) = w$. All that we have said in the last paragraph applies;
the conditional distribution of $c_{k+1}$ given $c_1, c_2, \ldots, c_k$ is the same in the two
cases; hence the joint distribution of $c_1, c_2, \ldots, c_k, c_{k+1}$ is the same in both
cases; but in this very particular case the joint distribution is known to be
that required by theorem $A_{k+1|n+1}$.
SOME BASIC THEOREMS FOR DEVELOPING TESTS OF FIT FOR
THE CASE OF THE NON-PARAMETRIC PROBABILITY
DISTRIBUTION FUNCTION, I

By Bradford F. Kimball
State Department of Public Service, New York, N. Y.

1. Summary. In developing tests of fit based upon a sample $O_n(x_i)$ in the
case that the cumulative distribution function $F(X)$ of the universe of $X$'s is
not necessarily a function of a finite number of specific parameters (sometimes
known as the non-parametric case), it has been pointed out by several writers
that the "probability integral transformation" is a useful device (cf. [1]-[4]).

The author finds that a modification of this approach is more effective. This
modification is to use a transformation of ordered sample values $x_i$ from a random
sample $O_n(x_i)$ based on successive differences of the cdf values $F(x_i)$.

A theorem is proved giving a simple formula for the expected values of the
products of powers of these differences, where all differences from 1 to $n + 1$ are
involved in a symmetrical manner.

The moment generating function of the test function defined as the sum of $m$
squares of these successive differences is developed and the application of such
a test function is briefly discussed.

2. Introduction. Let the sample values $x_i$ be ordered so that

(2.1) $x_i \le x_{i+1}$, $(i = 1, 2, \ldots, n - 1)$.

Let $F_r$ denote the value of the cdf $F(X)$ associated with the $r$th ordered sample
value $x_r$. Thus

(2.2) $F_r = F(x_r)$.

Consider the following transformation of the ordered sample values $x_i$ based
upon the (hypothetically) known cumulative distribution function $F(X)$ which
will be taken as a continuous function of $X$ over its admissible range:

$u_1 = F_1$,

(2.3) $u_r = F_r - F_{r-1}$, $(r = 2, 3, \ldots, n)$,

$u_{n+1} = 1 - F_n$.

The restrictions on $F_i$ are that

(2.4) $F_i \le F_{i+1}$, and $0 \le F_i \le 1$.

The above transformation (2.3) translates these conditions into the symmetrical
conditions

(2.5) $0 \le u_i$, and $u_1 + u_2 + \cdots + u_n + u_{n+1} = 1$.
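
The transformation (2.3) is easy to carry out numerically. The following is a minimal sketch; the function name `frequency_differences` and the use of a unit exponential cdf are assumptions made for illustration and are not part of the paper.

```python
import math

def frequency_differences(sample, cdf):
    """Return u_1, ..., u_{n+1} of (2.3) for a sample and a continuous cdf F."""
    F = [cdf(x) for x in sorted(sample)]   # F_1 <= F_2 <= ... <= F_n
    edges = [0.0] + F + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]

def exponential_cdf(x):
    # Hypothetical choice of F for illustration: unit exponential.
    return 1.0 - math.exp(-x)

u = frequency_differences([0.7, 0.1, 2.3, 1.2], exponential_cdf)
print(u, sum(u))
```

The $n$ sample values give $n + 1$ non-negative differences that add up to 1, which is condition (2.5).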

A one-to-one correspondence between $u_i$ and $F_i$ exists if one of the $u_i$ be
omitted, say $u_\beta$. With $u_\beta$ omitted, the Jacobian of the transformation from $F_i$ to
$u_i$ has value unity. The probability density of the sample $O_n(x_i)$, with $x_i$ ordered,
is given by

(2.6) $P[O_n(x_i)]\, dO_n = n!\, dF_1\, dF_2 \cdots dF_n$.

Hence with $u_\beta$ omitted,

(2.7) $P[O_n(x_i)]\, dO_n = n!\, du_1\, du_2 \cdots du_{\beta-1}\, du_{\beta+1} \cdots du_{n+1}$.

The sample space of the $u_i$ with $u_\beta$ omitted is that portion of the $n + 1$
Euclidean space of all the $u_i$ variables, bounded by the coordinate hyperplanes,
which is on the projection of the hyperplane (2.5) upon the hyperplane $u_\beta = 0$.
This is a region in the $n$-space of the $u_i$ with $u_\beta$ omitted, bounded by the
coordinate hyperplanes and the hyperplane

(2.8) $u_1 + u_2 + \cdots + u_{\beta-1} + u_{\beta+1} + \cdots + u_n + u_{n+1} = 1$.

Thus the formal integral of the pdf of the $u_i$ over sample space is

(2.9) $n! \int\!\!\int \cdots \int du_1 \cdots du_{\beta-1}\, du_{\beta+1} \cdots du_{n+1} = 1$

with $0 \le u_i$, and $u_i$ bounded above by the hyperplane (2.8).

It is now clear that both the pdf and the sample space of the $u_i$ (with $u_\beta$
omitted) are symmetrical in the $u_i$. This fact leads to complete symmetry of
the joint distribution function of any set of $u_i$, over $i = 1$ to $n + 1$ including $u_\beta$,
relative to the $u_i$ selected. Other interesting results are forthcoming.


3. Basic mathematical theorem. Using the techniques associated with the
Beta function, the expectation of the products of powers of the $u_i$ is found to be

(3.1) $E[u_r^p\, u_s^q\, u_t^w \cdots] = \Gamma(n+1)\,\Gamma(p+1)\,\Gamma(q+1)\,\Gamma(w+1) \cdots / \Gamma(n + p + q + w + \cdots + 1)$

where $r$, $s$, $t$, etc., are any set of different indices (for the present other than $\beta$)
from the integers 1 to $n + 1$, and $p$, $q$, $w$, etc., are any real numbers greater than
minus one. The relation (3.1) can further be generalized to the case where $u_\beta$
may be included. This will be proved for the case $n = 2$, with $p$, $q$ and $w$
taken as integers. The generalization can be concluded from inspection. Thus
with

$u_3 = 1 - u_1 - u_2$,

$E[u_1^p\, u_2^q\, u_3^w] = 2! \int_0^1 u_1^p\, du_1 \int_0^{1-u_1} u_2^q (1 - u_1 - u_2)^w\, du_2$

$= \frac{2!\, q!\, w!}{(q + w + 1)!} \int_0^1 u_1^p (1 - u_1)^{q+w+1}\, du_1$

$= \frac{2!\, p!\, q!\, w!}{(p + q + w + 2)!}$.


Hence the theorem: 
Theorem. Given a random sample of $n$ values of $X$ from a universe with cdf
$F(X)$ which is continuous over the range of $X$. With the sample values $x_i$ ordered
so that $x_i \le x_{i+1}$, define a set of $n + 1$ variables $u_i$ as the successive differences of
$F(x_i)$ by the relations (2.3). The expected value of the product of real powers greater
than minus one of any or all of the $u_i$, $(i = 1, 2, \ldots, n + 1)$, is given by the
relation (3.1) above (not subject to the omission of $u_\beta$).
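
Relation (3.1) can be spot-checked by simulation. The sketch below is illustrative only: it takes $F$ to be the uniform cdf on $(0, 1)$, so that the $u_i$ are the spacings of a uniform sample, and compares a Monte Carlo average with the Gamma-function expression for one assumed choice of powers, including a non-integer power and the last difference $u_{n+1}$. The function names, seed and number of repetitions are assumptions.

```python
import math
import random

def expected_product(n, powers):
    """Right-hand side of (3.1) for the listed powers (one per distinct index)."""
    num = math.gamma(n + 1)
    for p in powers:
        num *= math.gamma(p + 1)
    return num / math.gamma(n + sum(powers) + 1)

def spacings(n, rng):
    """The n + 1 successive differences for a uniform sample of n."""
    t = sorted(rng.random() for _ in range(n))
    edges = [0.0] + t + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]

rng = random.Random(3)
n, reps = 4, 100000
total = 0.0
for _ in range(reps):
    u = spacings(n, rng)
    total += u[0] ** 2 * u[2] ** 0.5 * u[4]  # u_1^2 * u_3^(1/2) * u_5
estimate = total / reps
exact = expected_product(n, [2, 0.5, 1])
print(estimate, exact)
```

For $n = 2$ and $p = q = w = 1$ the formula gives $2!/5! = 1/60$, in agreement with the integral worked out above.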

There are many interesting consequences of this theorem. Perhaps the most
striking is the following:

Corollary 1. Let a range $a(m, k)$ for positive integer $m$ be defined by

(3.2) $a(m, k) = F(x_{k+m}) - F(x_k)$

with $k = 0, 1, 2, \ldots, n$, and $m \le n + 1 - k$

under the convention

$F(x_0) = 0$, $F(x_{n+1}) = 1$.

The probability distribution of $a(m, k)$ is independent of $k$ and hence is the same as
that of $F(x_m)$.

Another interesting consequence (not new) is the following: 

Corollary 2. The correlation of $u_i$ and $u_k$, $i \ne k$, is the same for all pairs
$(i, k)$ over the range of indices from 1 to $n + 1$, and has the value $-1/n$.

Introducing the notation

(3.3) $[n + r]_r = (n + r)(n + r - 1) \cdots (n + 1)$,

the corollary follows from the relationships

$E(u_i) = 1/(n + 1)$, $E(u_i^2) = 2/[n + 2]_2$, $E(u_i u_k) = 1/[n + 2]_2$.
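
The step from these three expectations to the value $-1/n$ is a short piece of arithmetic, which the following sketch carries out in exact rational arithmetic. The function names are assumptions made for the example; since $u_i$ and $u_k$ have the same variance, the correlation is simply the covariance divided by that variance.

```python
from fractions import Fraction

def bracket(n, r):
    """[n + r]_r = (n + r)(n + r - 1) ... (n + 1), as in (3.3)."""
    out = 1
    for j in range(1, r + 1):
        out *= n + j
    return out

def correlation(n):
    """Correlation of u_i and u_k (i != k) from the three quoted expectations."""
    mean = Fraction(1, n + 1)             # E(u_i)
    second = Fraction(2, bracket(n, 2))   # E(u_i^2)
    cross = Fraction(1, bracket(n, 2))    # E(u_i u_k)
    covariance = cross - mean ** 2
    variance = second - mean ** 2
    return covariance / variance

for n in (1, 2, 5, 10):
    print(n, correlation(n))
```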

The fact that the correlation between any two frequency differences $u_i$ and $u_k$
is negative leads to the following more general relationship:

Corollary 3. For any set of different indices $i$, $j$, $k$, etc., and for any positive
numbers $p$, $q$, $r$, etc., the expectation of the product of the powers $p, q, r, \ldots$ of
$u_i, u_j, u_k, \ldots$ is less than the product of the expectations of the powers taken
separately:

(3.4) $E[u_i^p\, u_j^q\, u_k^r \cdots] < E(u_i^p) \cdot E(u_j^q) \cdot E(u_k^r) \cdots$.

This follows from generalization of the relation

$\frac{\Gamma(n+1)\,\Gamma(p+1)\,\Gamma(q+1)\,\Gamma(r+1)}{\Gamma(n+p+q+r+1)} < \frac{[\Gamma(n+1)]^3\,\Gamma(p+1)\,\Gamma(q+1)\,\Gamma(r+1)}{\Gamma(n+p+1)\,\Gamma(n+q+1)\,\Gamma(n+r+1)}$.

The above theorem suggests the possibility of test functions for fitted distribu- 
tions, relative to a universe with a cdf which, since it is merely conditioned by a 
sufficient hypothesis for the theorem, may be of the non-parametric type. 
A test function of the form

(3.5) $y_m = \sum_m u_i^p$, $p$ real and positive

might first come to mind. If $p = 1$, compensatory effects of deviations reduce
the efficiency of the test function. One is thus led first to consider the test
function (3.5) for the case $p = 2$.

4. The moments of the probability distribution of $y_m = \sum u_i^2$. We are
first concerned with the problem of the determination of the moments of the
function

(4.1) $y_m = \sum_m u_i^2$

where $i$ ranges over any particular fixed set of $m$ integers which for simplicity
is usually taken as the first $m$.

One first recalls the fact that the result is independent of which $m$ indices have
been selected, and that the expected value of any combination of powers is
independent of which specific subscripts of $u_i$ are involved.

Since the $u_i$ are correlated, principles of combinatory analysis are involved in
determining the moments of $y_m$. One possible way of obtaining the moments
is as follows:

Let $\nu_r$ denote the $r$th moment of $y_m$ about $y_m = 0$. Thus

(4.2) $\nu_r = E\bigl[\bigl(\sum_m u_i^2\bigr)^r\bigr]$.

Now in the expansion of $\bigl(\sum_m u_i^2\bigr)^r$ the sum of the power indices of each term
is $2r$. Thus referring back to (3.1) and (3.3) it will be noted that the expected
value of each such term will have the common factor

$1/[n + 2r]_{2r}$.

Consider a general term of the expansion of $\bigl(\sum_m u_i^2\bigr)^r$,

$C_{r_1 r_2 \cdots r_k}\, u_{i_1}^{2r_1} u_{i_2}^{2r_2} \cdots u_{i_k}^{2r_k}$ with $r_1 + r_2 + \cdots + r_k = r$.

Clearly

$E(u_{i_1}^{2r_1} u_{i_2}^{2r_2} \cdots u_{i_k}^{2r_k}) = (2r_1)!\,(2r_2)! \cdots (2r_k)!/[n + 2r]_{2r}$

and the coefficient $C_{r_1 r_2 \cdots r_k}$ is the multinomial coefficient

$C_{r_1 r_2 \cdots r_k} = \frac{r!}{r_1!\, r_2! \cdots r_k!}$.

Now in the expansion of $\bigl(\sum_m u_i^2\bigr)^r$ group the terms which have the same set of
$k$ values of $r_i$, irrespective of which indices of $u_i$ are involved. The number of
such terms (since each involves $k$ different indices) is $\binom{m}{k}$. If $r_1, r_2, \ldots, r_k$
are all different each combination could be taken in $k!$ different ways. Thus with
$r$'s all different and fixed, the sum of all coefficients of terms with same
combination of $2r_i$ powers (irrespective of variation of indices of the $u_i$) is

$\binom{m}{k}\, k!\, \frac{r!}{r_1!\, r_2! \cdots r_k!}$.

This would then constitute the total multiplier for

$(2r_1)!\,(2r_2)! \cdots (2r_k)!/[n + 2r]_{2r}$

for a given set of $k$ $r$'s which are all different.

If some of the $r$'s are repeated, let $k_1, k_2, \ldots, k_s$ denote the number of repetitions
of each different $r_i$ ($k_i \ge 1$, and $k_1 + k_2 + \cdots + k_s = k$). Then each
combination of the $k$ $r$'s corresponding to a set of $k$ products could be taken in

$k!/(k_1!\, k_2! \cdots k_s!)$

different ways. Hence the lemma:

Lemma 1. Consider all admissible sets of $k$ different subscripts of $u_i$ and a fixed
set of values of $r$: $r_1, r_2, \ldots, r_k$, where

$r_1 + r_2 + \cdots + r_k = r$

such that $s$ of these $r$'s are different, and the number of repetitions in the set of $r$'s is
given by $k_1, k_2, \ldots, k_s$ ($k_i \ge 1$, and $k_1 + k_2 + \cdots + k_s = k$). The composite
coefficient of the terms in $\nu_r$ involving the factor

$(2r_1)!\,(2r_2)! \cdots (2r_k)!/[n + 2r]_{2r}$

is given by

$\binom{m}{k}\, \frac{k!}{k_1!\, k_2! \cdots k_s!}\, \frac{r!}{r_1!\, r_2! \cdots r_k!}$.

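Examples of computation of $\nu_r$ by means of the above lemma. The first order
moment is given by

(4.4) $\nu_1 = E\bigl(\sum_m u_i^2\bigr) = m\, 2!/[n + 2]_2$.

The second order moment is given by

$\nu_2 = E\bigl[\bigl(\sum_m u_i^2\bigr)^2\bigr] = C_1 E(u_i^4) + C_2 E(u_i^2 u_j^2)$,

and determining the values of $C_i$ from Lemma 1,

$\nu_2 = \bigl[m\, 4! + \binom{m}{2}\, 2 \cdot 2!\, 2!\bigr] / [n + 4]_4$

or

(4.5) $\nu_2 = \bigl[m\, 4! + 8\binom{m}{2}\bigr] / [n + 4]_4 = \bigl[m + \tfrac{1}{3}\binom{m}{2}\bigr] / \binom{n+4}{4}$.

Again for the third order moment,

$\nu_3 = E\bigl[\bigl(\sum_m u_i^2\bigr)^3\bigr] = C_1 E(u_i^6) + C_2 E(u_i^4 u_j^2) + C_3 E(u_i^2 u_j^2 u_k^2)$,

and using Lemma 1,

$\nu_3 = \bigl[m\, 6! + \binom{m}{2}\, 2!\, \frac{3!}{2!\, 1!}\, 2!\, 4! + \binom{m}{3}\, \frac{3!}{1!\, 1!\, 1!}\, 2!\, 2!\, 2!\bigr] / [n + 6]_6$

$= \bigl[m\, 6! + \binom{m}{2}\, 2!\, 3!\, 4! + \binom{m}{3}\, 2!\, 2!\, 2!\, 3!\bigr] / [n + 6]_6$

or

(4.6) $\nu_3 = \bigl[m + \tfrac{2}{5}\binom{m}{2} + \tfrac{1}{15}\binom{m}{3}\bigr] / \binom{n+6}{6}$.

Similarly writing the fourth moment in the form

$\nu_4 = C_1 E(u_i^8) + C_2 E(u_i^6 u_j^2) + C_3 E(u_i^4 u_j^4) + C_4 E(u_i^4 u_j^2 u_k^2) + C_5 E(u_i^2 u_j^2 u_k^2 u_l^2)$

and using Lemma 1 it reduces to

(4.7) $\nu_4 = \bigl[m + \tfrac{2}{7}\binom{m}{2} + \tfrac{3}{35}\binom{m}{2} + \tfrac{3}{35}\binom{m}{3} + \tfrac{1}{105}\binom{m}{4}\bigr] / \binom{n+8}{8}$.

Higher order moments of the probability distribution function may be
computed as desired.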
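
The moments can also be obtained by brute force, which gives an independent check on the bookkeeping of Lemma 1. The sketch below is an assumption-laden illustration, not the author's procedure: it expands $\bigl(\sum_m u_i^2\bigr)^r$ over all ordered exponent patterns, applies (3.1) term by term with exact rational arithmetic, and compares the result with the closed forms for $\nu_1$ to $\nu_4$ that follow from Lemma 1. The function names and the values $n = 5$, $m = 3$ are chosen only for the example.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def bracket(n, r):
    """[n + r]_r = (n + r)(n + r - 1) ... (n + 1)."""
    out = 1
    for j in range(1, r + 1):
        out *= n + j
    return out

def moment(n, m, r):
    """nu_r = E[(u_1^2 + ... + u_m^2)^r], expanding the power termwise via (3.1)."""
    total = 0
    for js in product(range(r + 1), repeat=m):
        if sum(js) != r:
            continue
        coeff = factorial(r)      # becomes the multinomial coefficient r!/(j_1! ... j_m!)
        weight = 1                # becomes (2 j_1)! ... (2 j_m)!
        for j in js:
            coeff //= factorial(j)
            weight *= factorial(2 * j)
        total += coeff * weight
    return Fraction(total, bracket(n, 2 * r))

n, m = 5, 3
closed = {
    1: Fraction(2 * m, bracket(n, 2)),
    2: (m + Fraction(comb(m, 2), 3)) / comb(n + 4, 4),
    3: (m + Fraction(2 * comb(m, 2), 5) + Fraction(comb(m, 3), 15)) / comb(n + 6, 6),
    4: (m + Fraction(13 * comb(m, 2), 35) + Fraction(3 * comb(m, 3), 35)
        + Fraction(comb(m, 4), 105)) / comb(n + 8, 8),
}
for r in range(1, 5):
    print(r, moment(n, m, r), closed[r])
```

The enumeration grows like $(r + 1)^m$, so it is only practical for small $m$ and $r$; its purpose here is verification rather than computation.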

An alternate method of computing the moments of the distribution of this test
function is the following:

Consider a function $g_0(x)$ such that

(4.8) $d^r g_0(0)/dx^r = (2r)!$, $g_0(0) = 1$.

Thus

(4.9) $E(u_i^{2r}) = [d^r g_0(0)/dx^r]/[n + 2r]_{2r}$.

From the principles of combinatory analysis of linear operators, it follows that¹

(4.10) $E\bigl[\bigl(\sum_m u_i^2\bigr)^r\bigr] = \bigl[d^r (g_0(x))^m/dx^r \,\big|_{x=0}\bigr] / [n + 2r]_{2r}$.

Although this is an enlightening analytical form, actual computations seem to be
simpler with the use of Lemma 1.

¹ One way of seeing this is to first think of the $u_i$ as statistically independent. The
numerators of the resulting terms would be the same as in (4.10). When the $u_i$ are taken
as dependent, by virtue of (3.1) the numerators will remain the same while all denominators
will reduce to $[n + 2r]_{2r}$.
Moment generating function. The moment generating function of the
probability distribution of $y_m$ can be written as

(4.11) $E(e^{t y_m}) = G_0(t, m) = 1 + \sum_{r=1}^{\infty} \bigl[d^r (g_0(x))^m/dx^r \,\big|_{x=0}\bigr] / [n + 2r]_{2r} \cdot t^r/r!$

with

$g_0(x) = 1 + 2!\, x + 4!\, x^2/2! + 6!\, x^3/3! + \cdots + (2r)!\, x^r/r! + \cdots$

$[n + 2r]_{2r} = (n + 2r)(n + 2r - 1) \cdots (n + 1)$.

Although $g_0(x)$ exists only as a formal power series, $G_0(t, m)$ is defined by (4.11)
as a power series with positive coefficients, converging for all $t$.


5. Some comments on test function, p = 2. At the present time the study of
the test function for $p = 2$ has not gone far enough to justify publication of
results. One difficulty is that although its asymptotic distribution function
appears to be normal, the convergence towards normalcy may be extremely slow
in some cases.

Furthermore there are indications that the case $m = n + 1$ will give the most
definitive results not only because the complete range of data is used, but also
because errors of Type II would in general have a less erratic effect.

For the case $m = n + 1$ the mean, variance and third and fourth reduced
moments (i.e. moments about the mean divided by corresponding power of $\sigma$)
are:

Case $m = n + 1$.

$E(y_{n+1}) = 2/(n + 2)$, $\sigma^2 = 4n/[(n + 2)^2 (n + 3)(n + 4)]$,

(5.1) $\alpha_3 = \mu_3/\sigma^3 = \frac{10n - 4}{(n + 5)(n + 6)} \left[\frac{(n + 3)(n + 4)}{n}\right]^{1/2}$,

$\alpha_4 = \mu_4/\sigma^4 = \frac{n^3 + 101n^2 + 14n - 8}{(n + 5)(n + 6)(n + 7)(n + 8)} \left[\frac{3(n + 3)(n + 4)}{n}\right]$

$= 3 + \frac{6(41n^4 + 241n^3 + 118n^2 - 784n - 48)}{n(n + 5)(n + 6)(n + 7)(n + 8)}$.

If data is not grouped the test may be applied as follows: Suppose a function
Q(X) has been fitted to the cdf F(X). From a random sample of size n
with x_i ordered as in (2.1), compute the successive differences of Q(x_i) to obtain
the variables u_i^*. Then consider the sum of the squares

U^* = \sum_{i=1}^{n+1} u_i^{*2}.

If Q(X) is a true representation of F(X) the variation of U* will follow that of
y_{n+1}. Thus the expected value of U*, its variance etc. will be independent of
the fitted function Q(X), which represents certain advantages over the χ² test.
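
The procedure just described can be sketched as follows. This is a minimal illustration, not the author's computation: it assumes that the n + 1 differences are formed from the ordered values Q(x_i) together with the end values 0 and 1 (so that the differences sum to 1, as stated below for m = n + 1), and the names `spacing_statistic` and `null_mean` are hypothetical.

```python
def spacing_statistic(sample, fitted_cdf):
    """Sum of squared successive differences of the fitted cdf values.

    Assumes the differences are taken between 0, the ordered Q(x_i), and 1,
    which yields n + 1 non-negative values u_i* that add up to one.
    """
    q_values = sorted(fitted_cdf(x) for x in sample)
    edges = [0.0] + q_values + [1.0]
    spacings = [b - a for a, b in zip(edges, edges[1:])]
    return sum(u * u for u in spacings)


def null_mean(n):
    """Expected value of y_{n+1} when Q represents F, namely 2 / (n + 2)."""
    return 2.0 / (n + 2)


# Hypothetical usage: three observations and a fitted uniform cdf on (0, 1).
sample = [0.75, 0.25, 0.5]
u_star = spacing_statistic(sample, lambda x: x)
print(u_star, null_mean(len(sample)))
```

Equally spaced values of Q(x_i) give the smallest possible value 1/(n + 1); values of U* well above 2/(n + 2) point to a poor fit.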
The effect of Type II errors can be roughly analyzed as follows: In considering
the effect of such errors the testing procedure must be criticized from the point
of view that

Q(X) \neq F(X).

For m = n + 1 it still is true that

\sum u_i^* = 1

which tends to act as a control upon U*. For example set

u_i^* = u_i + x_i.

Then from the above relation it follows that

(5.2)  \sum x_i = 0.

Write U* as

(5.3)  U^* = \sum u_i^2 + \sum x_i^2 + 2 \sum u_i x_i
           = \sum u_i^2 + \sum x_i^2 + (2 \sum x_i)/(n + 1) + 2 \sum x_i \delta(u_i)

where δ(u_i) denotes the variation of the true frequency differences from their
expected value 1/(n + 1).

The variation δ(u_i) will be to a considerable degree independent of x_i. Thus
the term Σ x_i² will in general tend to be larger than the last term on the right.
The third term on the right will be zero by virtue of (5.2), and hence U* will tend
to be larger than y_{n+1}. A similar effect upon the sampling variance of U* can
be noted. Hence an interval of rejection

U^* \geq A, \qquad P[y_{n+1} \geq A] = \alpha = confidence level,

is pointed to.

On the other hand if m < n + 1 the condition (5.2) no longer holds, the term
(2 Σ x_i)/(n + 1) of (5.3) will not be zero and in many cases would dominate the
other two error terms. Thus it is easily conceivable that one may have in the
case m < n + 1

U^* < y_m

even when the discrepancies x_i are large. Hence in the case m < n + 1 choice
of confidence interval will require considerable care (see [1]).

Although the distribution of y_{n+1} for small n is decidedly non-normal, if the
test function is replaced by

(5.4)  r_{n+1} = \left( \sum [u_i - 1/(n + 1)]^2 \right)^{1/2}

it will be found that the probability density function takes on the normal character
quite rapidly with increasing n. Indeed the author has found that a computed
approximation to the probability density function of r_{n+1} with n = 4 is
decidedly normal in character.
AN ESSENTIALLY COMPLETE CLASS OF ADMISSIBLE DECISION FUNCTIONS

By Abraham Wald
Columbia University

Summary. With any statistical decision procedure (function) there will be
associated a risk function r(θ) where r(θ) denotes the risk due to possible wrong
decisions when θ is the true parameter point. If an a priori probability distribution
of θ is given, a decision procedure which minimizes the expected value of
r(θ) is called the Bayes solution of the problem. The main result in this note
may be stated as follows: Consider the class C of decision procedures consisting
of all Bayes solutions corresponding to all possible a priori distributions of θ.
Under some weak conditions, for any decision procedure T not in C there exists
a decision procedure T* in C such that r*(θ) ≤ r(θ) identically in θ. Here r(θ)
is the risk function associated with T, and r*(θ) is the risk function associated
with T*. Applications of this result to the problem of testing a hypothesis are
made.


1. Introduction. In some previous publications [1], [2] the author has
considered the following general problem of statistical inference: Let
X = (X_1, ..., X_n) be a set of chance variables. Suppose that the only information
we have concerning the joint distribution function F of these chance
variables is that F is an element of a given class Ω of distribution functions.
Suppose, furthermore, that a class D of possible decisions d is given, one of which
is to be made on the basis of an observation x = (x_1, ..., x_n) on the chance
vector X. The problem is then to construct a function d(x), called statistical
decision function, which associates with each sample point x an element d(x)
of D so that the decision d(x) is made when the sample point x is observed. A
statistical decision function d(x) is defined over all possible points x of the sample
space and for each sample point x the value of the function is an element of D.
Each element d of D will usually be interpreted as a decision to accept the
hypothesis that the unknown distribution F of X belongs to a certain subclass
ω of Ω. Different elements d of D correspond to different subclasses ω of Ω.
The problem of testing the hypothesis H that the unknown distribution function
F belongs to a given subclass ω of Ω is contained as a special case in the
above general problem. The space D will then contain only two elements
d_1 and d_2, where d_1 denotes the decision of accepting H and d_2 denotes the
decision of rejecting H.

As in [1] and [2], we shall assume also here that Ω is a k-parameter family of
distribution functions. Then each element of Ω may be represented by a point
θ, called parameter point, in the k-dimensional Cartesian space.
The class Ω is then represented by a subset of the k-dimensional Cartesian space,
called parameter space. We shall, therefore, refer to Ω as the parameter space
and to its elements as parameter points.

The merits of any particular decision function d(x) will usually depend on
the relative importance of the various possible errors caused by not selecting
the proper element d of D. The relative importance of such errors has been
described in [1] and [2] by a weight function W(θ, d) defined over the product of
Ω and D. For any pair (θ, d) the value of W(θ, d) is non-negative and expresses
the loss caused by taking the decision d when θ is the true parameter point.
For any given decision function d(x) the expected value of the loss is given by

(1.1)  r(\theta) = \int_M W[\theta, d(x)] \, dF(x)

where M denotes the sample space and F(x) is the joint cumulative distribution of
X = (X_1, ..., X_n) corresponding to the parameter point θ.

The function r(θ) is defined over the parameter space Ω and is called the risk
function. The shape of the risk function r(θ) will, in general, be affected by the
decision function d(x) used. To put this dependence in evidence, we shall use
the symbol r[θ | d(x)] to denote the risk function r(θ) associated with the decision
function d(x).

A decision function d(x) is said to be uniformly better than the decision
function d*(x) if

(1.2)  r[\theta | d(x)] \leq r[\theta | d^*(x)]

for all θ and if there exists at least one point θ for which the inequality sign holds
in (1.2). A decision function d(x) is said to be admissible if no other uniformly
better decision function exists.

A class C of admissible decision functions will be said to be essentially complete
if for any decision function d(x) not in C there exists a decision function d*(x)
in C such that

r[\theta | d^*(x)] \leq r[\theta | d(x)]

for all θ.

In section 2 we shall formulate certain assumptions which will then be used
in section 3 to derive an essentially complete class of admissible decision functions.
In section 4 applications are made to the problem of testing a hypothesis.

In a recent paper Lehmann [3] obtained an essentially complete class of
admissible tests for each hypothesis H of a certain restricted class of simple
hypotheses. The restrictions imposed on Ω in Lehmann's paper are essentially
those formulated by Neyman [4], [5] to insure the existence of the type A_1
(uniformly most powerful unbiassed) test. Our definition of an essentially complete
class of admissible decision functions agrees with that given by Lehmann
when the problem is to test a hypothesis and the weight function W(θ, d) can
take only the values 0 and 1.
2. Assumptions. Throughout this paper we shall make the following assumptions:

Assumption 1: The parameter space Ω is a bounded and closed subset of a
finite dimensional, say k-dimensional, Cartesian space.

We shall introduce the following convergence definition in the space D: a
sequence {d_i}, (i = 1, 2, ..., ad inf.), of elements of D is said to converge
to the element d of D if

\lim_{i \to \infty} W(\theta, d_i) = W(\theta, d)

uniformly in θ.

Assumption 2: The space D is compact and, for any d, W(θ, d) is a continuous
function of θ.

Assumption 3: For any point θ of Ω the joint distribution function of
X = (X_1, ..., X_n) admits a density function p(x, θ) for all points x of the
n-dimensional Cartesian space M (sample space). The density function p(x, θ)
is assumed to be continuous in x and θ jointly.

In what follows we shall mean by a distribution function f(θ) of θ a cumulative
distribution function for which \int_\Omega df(\theta) = 1 and for which
\int_\Omega W(\theta, d) \, df(\theta) is not zero identically in d.

Assumption 4: For any point x of M, except perhaps for a set of measure
zero, and for any cumulative distribution function f(θ) there exists one and
only one element of D for which the expression

(2.1)  \int_\Omega W(\theta, d) p(x, \theta) \, df(\theta)

takes its minimum value with respect to d.

Assumptions 1 and 3 in this paper are exactly the same as Assumptions 1 and 3
in [2]. The formulation of Assumptions 2 and 4 is somewhat different from
that given in [2]. This is mainly due to the fact that in [2] the space D has the
same elements as Ω, while here this is not necessarily so. It can be verified
without difficulty that this slight modification of the assumptions does not
affect in any way the validity of the results obtained in [2]. Thus, we shall be
able to make use of any theorems proved in [2] for the purposes of the present
paper.

3. Derivation of an essentially complete class of admissible decision functions.
For any distribution function f(θ) defined over Ω and for any sample
point x let d(x, f) denote the element of D for which the expression (2.1) takes
its minimum value. It follows easily from the definition of r(θ) and d(x, f) that

(3.1)  \int_\Omega r[\theta | d(x, f)] \, df(\theta) \leq \int_\Omega r[\theta | d^*(x)] \, df(\theta)

for any decision function d*(x). If we interpret f(θ) as an a priori probability
distribution of θ, inequality (3.1) says that the expected value of r(θ) takes its
minimum value for the decision function d(x, f). We shall refer to d(x, f) as
the Bayes' solution of the problem corresponding to the a priori probability
distribution f(θ).
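
As an illustration only, the definition of d(x, f) can be sketched for the special case where the a priori distribution puts its mass on finitely many parameter points, so that the integral in (2.1) becomes a finite sum. The function and argument names, the two-point normal example and the 0-1 weight function below are assumptions made for this sketch and do not come from the paper.

```python
import math


def bayes_decision(x, thetas, prior, decisions, weight, density):
    """Return the decision d minimizing sum_j W(theta_j, d) p(x, theta_j) f_j.

    This is expression (2.1) for a prior with finite support (an assumption
    of this sketch); thetas and prior are parallel lists.
    """
    def weighted_loss(d):
        return sum(weight(theta, d) * density(x, theta) * mass
                   for theta, mass in zip(thetas, prior))
    return min(decisions, key=weighted_loss)


# Hypothetical example: unit-variance normal densities with mean theta,
# decisions "theta = 0" and "theta = 1", and a 0-1 weight function.
def normal_density(x, theta):
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)


def zero_one_weight(theta, d):
    return 0.0 if theta == d else 1.0


thetas = [0.0, 1.0]
prior = [0.5, 0.5]
decisions = [0.0, 1.0]
print(bayes_decision(0.2, thetas, prior, decisions, zero_one_weight, normal_density))
print(bayes_decision(0.9, thetas, prior, decisions, zero_one_weight, normal_density))
```

With equal prior masses the rule simply picks the parameter point under which the observation has the larger density; changing the prior masses or the weight function shifts the boundary between the two decisions.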

We shall now prove the following theorem.

THEOREM 3.1. The class C of all Bayes' solutions d(x, f) corresponding to all
possible a priori distributions f(θ) is an essentially complete class of admissible
decision functions.

Proof. First we show that for any distribution f(θ) the decision function
d(x, f) is admissible. Let d(x) be a decision function such that

r[\theta | d(x)] \leq r[\theta | d(x, f)]

for all θ. Then

(3.2)  \int_\Omega r[\theta | d(x)] \, df(\theta) \leq \int_\Omega r[\theta | d(x, f)] \, df(\theta).

From the definition of d(x, f) it follows that the equality sign must hold in
(3.2), i.e.,

(3.3)  \int_\Omega r[\theta | d(x)] \, df(\theta) = \int_\Omega r[\theta | d(x, f)] \, df(\theta).

From the second half of Theorem 4.2 in [2] it then follows that

r[\theta | d(x)] = r[\theta | d(x, f)]

for all θ. Hence d(x, f) is an admissible decision function.

We shall now show that the class C of decision functions d(x, f) corresponding
to all possible a priori distributions f(θ) is essentially complete. Let d_0(x) be
any decision function not in the class C. The essential completeness of the
class C is proved if we can show that there exists a distribution f(θ) such that

(3.4)  r[\theta | d(x, f)] \leq r[\theta | d_0(x)]

for all θ.

To prove (3.4) we shall consider the weight function

(3.5)  W^*(\theta, d) = W(\theta, d) - r[\theta | d_0(x)] + \max_\theta r[\theta | d_0(x)].

The maximum of r[θ | d_0(x)] exists, since according to Theorem 4.1 in [2] r[θ | d_0(x)]
is a continuous function of θ. Clearly, Assumptions 1-4 remain valid if we
replace W(θ, d) by W*(θ, d). Let r*[θ | d(x)] denote the risk function associated
with the decision function d(x) if the weight function is given by W*(θ, d).
According to Theorem 5.2 in [2] there exists a decision function d*(x) such that

(3.6)  \max_\theta r^*[\theta | d^*(x)] \leq \max_\theta r^*[\theta | d(x)]

for any decision function d(x). Since

\max_\theta r^*[\theta | d_0(x)] = \max_\theta r[\theta | d_0(x)]

it follows from (3.6) that

(3.7)  \max_\theta r^*[\theta | d^*(x)] \leq \max_\theta r[\theta | d_0(x)].

Inequalities (3.5) and (3.7) imply

(3.8)  r[\theta | d^*(x)] \leq r[\theta | d_0(x)]

for all θ.

For any distribution f(θ) we shall denote by d*(x, f) the Bayes solution of
the problem corresponding to the a priori distribution f(θ) when the weight
function is given by W*(θ, d). Since W*(θ, d) - W(θ, d) depends only on θ
but not on d, one can easily verify that d*(x, f) = d(x, f). It follows from
Theorems 4.4 and 5.1 in [2] that there exists a distribution f(θ), the so-called
least favorable distribution, such that (3.6) remains valid if we replace d*(x)
by d*(x, f). Thus we can put

(3.9)  d^*(x) = d^*(x, f) = d(x, f).

Hence, from (3.8) we obtain

r[\theta | d(x, f)] \leq r[\theta | d_0(x)]

for all θ. This completes the proof of Theorem 3.1.

4. Applications to the problem of testing a hypothesis. In this section we
shall apply the results of the preceding section to the problem of testing the
hypothesis H that the true parameter point is included in a given subset ω of Ω.
We shall assume that ω is an open subset of Ω. The space D consists now only
of two elements, d_1 and d_2, where d_1 denotes the decision of accepting H and d_2
denotes the decision of rejecting H.

We shall assume that W(θ, d_1) is equal to zero for points θ in the interior
or on the boundary of ω, and positive elsewhere. Similarly, W(θ, d_2) will be
assumed to be positive for points θ inside ω and zero outside ω. For any a priori
distribution f(θ) the Bayes solution is given by the following test: We reject
the hypothesis H if (and only if)^

(4.1)  \int_\Omega W(\theta, d_1) p(x, \theta) \, df(\theta) > \int_\Omega W(\theta, d_2) p(x, \theta) \, df(\theta).

Thus, the class C of regions (4.1), corresponding to all possible distributions
f(θ), is an essentially complete class of admissible critical regions.

^ Whether the equality sign is included or not in (4.1) is of no consequence, since by
Assumption 4 the measure of the set of points x for which the equality holds in (4.1) is zero.

For any critical region R we shall denote the probability that the sample x
will fall in R when θ is true by P(θ | R). It follows from Lemma 4.4 in [2] and
Assumption 3 that P(θ | R) is a continuous function of θ for any region R.
Since W(θ, d_1) is positive in the interior of Ω - ω, and W(θ, d_2) is positive in ω,
the class C of regions defined in (4.1) will have the following properties:

(a) For any region R outside the class C there exists a region R* in C such that

P(\theta | R^*) \leq P(\theta | R) in ω

and

P(\theta | R^*) \geq P(\theta | R) in Ω - ω.

(b) If R and R* are members of C such that

P(\theta | R^*) \leq P(\theta | R) in ω

and

P(\theta | R^*) \geq P(\theta | R) in Ω - ω,

then

P(\theta | R^*) = P(\theta | R) for all θ.

For any distribution g(θ) consider the critical region consisting of all sample
points x satisfying

(4.2)  \int_{\Omega - \omega} p(x, \theta) \, dg(\theta) > \int_\omega p(x, \theta) \, dg(\theta).

Let C* be the class of regions (4.2) corresponding to all possible distributions g(θ).
One can easily verify that any region in C is also a member of C*. Thus, the
following theorem holds:

THEOREM 4.1. Suppose that Assumptions 1 and 3 are fulfilled and ω is an open
subset of Ω. Suppose, furthermore, that for any distribution g(θ) the set of sample
points x satisfying the equation

\int_{\Omega - \omega} p(x, \theta) \, dg(\theta) = \int_\omega p(x, \theta) \, dg(\theta)

has the measure zero. Then, for any region R outside the class C* there will be a
region R* in C* such that

P(\theta | R^*) \leq P(\theta | R) in ω

and

P(\theta | R^*) \geq P(\theta | R) in Ω - ω.

Addition at proof reading: After this paper was sent to the printer, the author
obtained a generalization of Theorem 3.1 to sequential decision functions, as well as
some other results. They will appear in a forthcoming issue of Econometrica.
DISCRIMINATING BETWEEN BINOMIAL DISTRIBUTIONS

By Paul G. Hoel
University of California at Los Angeles

1. Summary. Given a set of k random samples, x_1, x_2, ..., x_k, from a
binomial distribution with parameters p and n, it is shown that the familiar
binomial index of dispersion

z = \frac{\sum_{i=1}^{k} (x_i - \bar{x})^2}{\bar{x}(1 - \bar{x}/n_0)}

yields an approximate best critical region independent of p for testing the
hypothesis n = n_0 against the alternative hypothesis n > n_0, provided x̄ and
n_0 - x̄ are not small. Because of the nature of the test, its optimum properties
also apply to testing whether the data came from a binomial population with
n = n_0 or from a Poisson population.

2. Introduction. A problem of considerable interest in certain fields is that 
of deciding whether a set of observations should be treated as having come from
either a binomial population or from a Poisson population. Although there was 
much discussion a few years ago concerning the best method for making such a 
decision [1], [2], [3], no solution of the problem was presented. In this paper a 
test that possesses certain optimum properties is derived for discriminating 
between two binomial populations. This test, however, is also capable of solving 
the problem of how to discriminate between a binomial and a Poisson population. 
The methods that are employed in the derivation of this test are similar to those 
of an earlier paper [4] in which the problem of discriminating between two Poisson 
populations was studied. 


3. Similar regions. Let n denote the number of trials and p the probability
of success in a single trial for a binomial distribution. Let x_1, x_2, ..., x_k represent
the observed frequencies in k random samples from this binomial population.
Now consider the two alternative hypotheses

H_0: n = n_0, p = p_0

and

H_1: n = n_1 > n_0, p = p_1.

The purpose of this paper is to construct a test for discriminating between the two
values of n regardless of the values of p; however it is convenient to begin with
these more restrictive hypotheses.
For the purpose of finding a critical region for testing H_0 against H_1, the x_i
will be treated as the coordinates of a point in k dimensions. The probability of
obtaining the particular point x_1, ..., x_k when H_0 is true will be denoted by
P_0[x_i]. Since the probability of obtaining x successes in n trials is given by

\frac{n!}{x! (n - x)!} p^x q^{n - x},

it follows that

(1)  P_0[x_i] = \frac{(n_0!)^k \, p_0^{\sum x_i} q_0^{k n_0 - \sum x_i}}{\prod_{i=1}^{k} x_i! (n_0 - x_i)!}.

In searching for a critical region that will be independent of p_0, it is illuminating
to study the methods that were designed by Neyman and Pearson [5] for
continuous distributions. These methods suggest that one should look for critical
regions on the surfaces \sum_1^k x_i = constant. For this reason, instead of
using (1) for constructing critical regions, it is desirable to study the conditional
probability distribution of the points lying in the plane \sum_1^k x_i = N, where N is a
positive integer not exceeding kn_0. The conditional probability of obtaining
the point x_1, ..., x_k, when the point is restricted to lie in the plane \sum_1^k x_i = N,
will be denoted by P_0[x_i | N]. Its value may be obtained by dividing the probability
(1) by the probability that the point will lie in the plane \sum_1^k x_i = N. If
this latter probability is denoted by P_0[N], then

(2)  P_0[x_i | N] = \frac{P_0[x_i]}{P_0[N]}.

Since the sum of k independent variables each possessing the same binomial
distribution has a binomial distribution with n replaced by kn, it follows that N
possesses a binomial distribution and that

(3)  P_0[N] = \frac{(kn_0)!}{N! (kn_0 - N)!} p_0^N q_0^{kn_0 - N}.

If (1) and (3) are substituted in (2), it will reduce to

(4)  P_0[x_i | N] = \frac{(n_0!)^k N! (kn_0 - N)!}{(kn_0)! \prod_{i=1}^{k} x_i! (n_0 - x_i)!}.

This conditional probability distribution in the plane \sum x_i = N is independent
of p_0 and therefore may serve as the basis for constructing a critical region that
is independent of p_0 for testing H_0 against H_1. It will therefore be possible to
test the less restrictive hypothesis

H_0': n = n_0

against

H_1': n = n_1 > n_0.
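
The independence of the conditional distribution from p_0 can be checked numerically. The following sketch is illustrative only (the function names are hypothetical): it computes the conditional probability of a sample point given its total, once as the ratio of the joint binomial probability to the binomial probability of the total and once from the closed form, using exact fractions.

```python
from fractions import Fraction
from math import comb


def joint_probability(xs, n0, p0):
    """Probability of the point (x_1, ..., x_k) under a binomial (n0, p0)."""
    q0 = 1 - p0
    prob = Fraction(1)
    for x in xs:
        prob *= comb(n0, x) * p0 ** x * q0 ** (n0 - x)
    return prob


def conditional_probability(xs, n0, p0):
    """Joint probability divided by the probability that sum(x_i) = N."""
    k, total = len(xs), sum(xs)
    q0 = 1 - p0
    prob_total = comb(k * n0, total) * p0 ** total * q0 ** (k * n0 - total)
    return joint_probability(xs, n0, p0) / prob_total


def conditional_closed_form(xs, n0):
    """The same conditional probability written without p0."""
    k, total = len(xs), sum(xs)
    numerator = 1
    for x in xs:
        numerator *= comb(n0, x)
    return Fraction(numerator, comb(k * n0, total))


xs, n0 = [1, 2, 0], 3
print(conditional_probability(xs, n0, Fraction(1, 4)))
print(conditional_probability(xs, n0, Fraction(2, 3)))
print(conditional_closed_form(xs, n0))
```

All three printed values coincide, whatever value of p_0 is supplied, since the factors involving p_0 cancel in the ratio.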


4. Best critical region. Although a best critical region does not exist for
testing H_0' against H_1', it is helpful to proceed as though one did.

If a critical region of size α could be selected in each plane \sum_1^k x_i = N,
(N = 0, 1, ..., kn_0), then the totality of such critical regions would constitute
a critical region of size α that is independent of p_0 and which therefore could be
used to test H_0' against H_1'. For, if P_0[X ∈ C.R.] denotes the probability that
the sample point, which will be denoted by X, will lie in the critical region, it
follows that

(5)  P_0[X \in C.R.] = \sum_{N=0}^{kn_0} P_0[N] P_0[X \in C.R. | N]
                     = \sum_{N=0}^{kn_0} P_0[N] \alpha
                     = \alpha.

This last equality follows from the fact that the sample point must lie in one of
the planes \sum_1^k x_i = N, (N = 0, 1, ..., kn_0).

Furthermore, this would be the only critical region of size α independent of
p_0, because if a critical region of size α_N, (N = 0, 1, ..., kn_0), were selected in
the plane \sum_1^k x_i = N, (N = 0, 1, ..., kn_0), it would be necessary that

\sum_{N=0}^{kn_0} P_0[N] \alpha_N = \alpha,

independent of the value of p_0. From (3) this is equivalent to requiring that

(6)  \sum_{N=0}^{kn_0} \frac{(kn_0)!}{N! (kn_0 - N)!} p_0^N (1 - p_0)^{kn_0 - N} \alpha_N = \alpha,

independent of the value of p_0. Since the left side of (6) is a polynomial in p_0,
its constant term must equal α and all other coefficients must vanish. It will be
observed that no terms of the sum in (6) that arise from N > r will contribute to
the coefficient of p_0^r; consequently this coefficient will not contain the unknowns
α_{r+1}, ..., α_{kn_0}. These considerations show that the α_N must satisfy equations
of the form

\alpha = C_{00} \alpha_0
0 = C_{10} \alpha_0 + C_{11} \alpha_1
\cdots
0 = C_{kn_0,0} \alpha_0 + C_{kn_0,1} \alpha_1 + \cdots + C_{kn_0,kn_0} \alpha_{kn_0}.

It will also be observed that C_{rr} = (kn_0)!/[r! (kn_0 - r)!]; consequently the triangular
matrix of the coefficients in these kn_0 + 1 non-homogeneous equations is non-singular.
The equations therefore possess a unique solution, namely the known
solution α_N = α.

The preceding discussion shows that it is necessary to find critical regions of
size α in each plane \sum_1^k x_i = N, (N = 0, 1, ..., kn_0), if a critical region independent
of p_0 is desired. If each such planar critical region were a best critical
region for that plane, then the totality of such regions would constitute a
best critical region independent of p_0 for testing H_0' against H_1'.

It follows from the theory of best critical regions [5] that if a best critical region
in the plane \sum_1^k x_i = N exists, it would be determined by the inequality

(7)  \frac{P_0[x_i | N]}{P_1[x_i | N]} < K,

where P_1 corresponds to P_0 when H_1 is true and where K is a constant whose value
is chosen to make the critical region one of size α. Now from (4),

(8)  \frac{P_0[x_i | N]}{P_1[x_i | N]} = \frac{(n_0!)^k (kn_0 - N)! (kn_1)! \prod (n_1 - x_i)!}{(n_1!)^k (kn_1 - N)! (kn_0)! \prod (n_0 - x_i)!}.

In order to study the possibility of a best critical region, it is therefore necessary
to study the possibility of (8) satisfying inequality (7).


5. Approximate best critical region. Unfortunately, because the variables x_i
are discrete, it is not possible to find critical regions of exactly size α for arbitrary
α as required in (5). Consequently it is necessary to introduce continuous
approximating functions for the discrete probability functions or to resort to other
devices if critical regions of the type discussed in the preceding section are desired.

For the purpose of introducing such approximations, (8) will be written in
the following form:

(9)  \frac{P_0[x_i | N]}{P_1[x_i | N]} = C_1 \frac{\dfrac{(kn_0 - N)!}{\prod (n_0 - x_i)!} \left( \dfrac{1}{k} \right)^{kn_0 - N}}{\dfrac{(kn_1 - N)!}{\prod (n_1 - x_i)!} \left( \dfrac{1}{k} \right)^{kn_1 - N}},
where C_1 is independent of the variables x_i. It will be observed that the ratio
on the right is a ratio of two multinomial functions. Now the multinomial
function

\frac{N!}{x_1! x_2! \cdots x_k!} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k},

where \sum_1^k x_i = N, can be approximated by the multivariate normal function

(10)  \frac{1}{(2\pi N)^{(k-1)/2} \sqrt{p_1 p_2 \cdots p_k}} \exp\left[ -\frac{1}{2} \sum_{1}^{k} \frac{(x_i - Np_i)^2}{Np_i} \right].

The approximation is good provided the Np_i are large and the x_i remain away
from their extreme values. If this approximation is applied to both numerator
and denominator of (9), to this order of approximation,

(11)  \frac{P_0[x_i | N]}{P_1[x_i | N]}
      = C_1 \frac{[2\pi(kn_0 - N)]^{-(k-1)/2} \, k^{k/2} \exp\left[ -\frac{1}{2} \sum_1^k \frac{(x_i - N/k)^2}{n_0 - N/k} \right]}{[2\pi(kn_1 - N)]^{-(k-1)/2} \, k^{k/2} \exp\left[ -\frac{1}{2} \sum_1^k \frac{(x_i - N/k)^2}{n_1 - N/k} \right]}
      = C_1 \left[ \frac{kn_1 - N}{kn_0 - N} \right]^{(k-1)/2} \exp\left[ -\frac{1}{2} \frac{n_1 - n_0}{(n_1 - N/k)(n_0 - N/k)} \sum_1^k (x_i - N/k)^2 \right].
Since, by hypothesis, n_1 > n_0 and n_0 > N/k, except for the case of n_0 = N/k,
which will be considered later, it follows that

\frac{n_1 - n_0}{(n_1 - N/k)(n_0 - N/k)} > 0.

As a consequence, the right side of (11) will decrease in value as \sum_1^k (x_i - N/k)^2
increases in value. If (x_1, ..., x_k) is a point lying on the sphere

(12)  \sum_1^k (x_i - N/k)^2 = R^2

and if the coordinates of this point satisfy inequality (7) when approximation
(11) is used, then all points outside this sphere will also satisfy (7) to this same
order of approximation. A best critical planar region of size α in this approximate
sense can therefore be obtained in the plane \sum_1^k x_i = N by determining a
sphere with center at (N/k, ..., N/k) such that when H_0' is true the probability
is α that a point lying in the plane will lie outside this sphere. Furthermore, such
a region will be a common best critical region for all values of n_1 > n_0 because
the preceding arguments do not require the value of n_1 but merely the knowledge
that n_1 > n_0.

For the purpose of determining the radius of the sphere that will yield the
desired critical region, (4) will be expressed as follows:

(13)  P_0[x_i | N] = C_2 \, \frac{N!}{\prod x_i!} \left( \frac{1}{k} \right)^{N} \cdot \frac{(kn_0 - N)!}{\prod (n_0 - x_i)!} \left( \frac{1}{k} \right)^{kn_0 - N},

where C_2 is independent of the x_i. If these multinomials are replaced by their
multivariate normal approximations as given by (10), to this approximation
(13) will reduce to

(14)  P_0[x_i | N] = C_3 \exp\left[ -\frac{1}{2} \sum_1^k (x_i - N/k)^2 \left( \frac{1}{N/k} + \frac{1}{n_0 - N/k} \right) \right]
                   = C_3 \exp\left[ -\frac{1}{2} \frac{\sum_1^k (x_i - N/k)^2}{\frac{N}{k} \left( 1 - \frac{N}{kn_0} \right)} \right],

where C_3 is independent of the x_i. Since \sum_1^k x_i = N here, x_k may be expressed
in terms of the remaining variables; consequently (14), except for a constant
factor, may be treated as a normal distribution in the variables x_1, ..., x_{k-1}.
If the factorials in C_3 are replaced by their Stirling approximations, it will be
found that C_3 is the correct constant for the normal distribution.

Since it is known [6] that -2 times the exponent in a normal distribution function
possesses a chi-square distribution, it follows that to this order of approximation

(15)  \frac{\sum_1^k (x_i - N/k)^2}{\frac{N}{k} \left( 1 - \frac{N}{kn_0} \right)}

possesses a chi-square distribution with k - 1 degrees of freedom. If χ_α² is a
value such that P[χ² > χ_α²] = α, then

(16)  \frac{\sum_1^k (x_i - N/k)^2}{\frac{N}{k} \left( 1 - \frac{N}{kn_0} \right)} = \chi_\alpha^2

determines a sphere such that to this order of approximation the probability is
α that a point lying in the plane \sum_1^k x_i = N will lie outside the sphere. From
the arguments following (12), it therefore follows that a common best critical
region in this approximate sense for testing H_0' against H_1' will consist of that
part of each plane \sum_1^k x_i = N, (N = 0, 1, ..., kn_0), which lies outside the
corresponding sphere given by (16). Since the x_i are non-negative and do not
exceed n_0, the planes corresponding to N = 0 and N = kn_0 contain a single
point; therefore it is necessary to adopt some convention that assigns 100α percent
of the samples with N = 0 and N = kn_0 to a critical region in order to obtain
critical regions of size α in these two cases.
For a given set of data, the procedure to be followed then consists in calculating
the statistic

z = \frac{\sum_{1}^{k} (x_i - \bar{x})^2}{\bar{x}(1 - \bar{x}/n_0)},

where \bar{x} = \sum_1^k x_i / k, and agreeing to reject the hypothesis that n = n_0 in
favor of the alternative hypothesis that n > n_0 if and only if z > χ_α², where
P[χ² > χ_α²] = α for k - 1 degrees of freedom. Because of the nature of the
approximations used in (10) and (14), this result may be expected to be accurate
only if x̄ and n_0 - x̄ are large.

The interesting feature of this result is that the familiar binomial index of
dispersion, z, possesses optimum properties in this approximate sense for testing
n = n_0 against n > n_0.
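
The calculation of the test statistic can be sketched as follows. The sketch assumes the usual form of the binomial index of dispersion, the sum of squared deviations from the sample mean divided by x̄(1 - x̄/n_0); the function names are hypothetical, and the critical value for k - 1 degrees of freedom is passed in by the caller (for example from a chi-square table) rather than computed here.

```python
def binomial_dispersion_index(xs, n0):
    """Binomial index of dispersion z for observed frequencies xs and n = n0."""
    k = len(xs)
    mean = sum(xs) / k
    if not 0 < mean < n0:
        # Corresponds to the degenerate planes N = 0 and N = k * n0.
        raise ValueError("index undefined when the mean is 0 or n0")
    sum_sq = sum((x - mean) ** 2 for x in xs)
    return sum_sq / (mean * (1.0 - mean / n0))


def reject_n0(xs, n0, chi2_critical):
    """Reject n = n0 in favor of n > n0 when z exceeds the critical value."""
    return binomial_dispersion_index(xs, n0) > chi2_critical


# Hypothetical data: k = 4 samples, hypothesized n0 = 8.
xs = [2, 4, 6, 4]
z = binomial_dispersion_index(xs, 8)
# 7.815 is the upper 5 percent point of chi-square with 3 degrees of freedom.
print(z, reject_n0(xs, 8, 7.815))
```

As noted in the text, the chi-square approximation behind the critical value should only be relied on when x̄ and n_0 - x̄ are large, which the small example above does not attempt to satisfy.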

6. Poisson application. Since the preceding test will possess approximate
optimum properties for n as large as desired, independent of the value of p,
and since a Poisson distribution with parameter m can be approximated as
closely as desired by means of a binomial distribution with np = m by allowing
n to increase sufficiently, it follows that the test will also possess approximate
optimum properties for deciding between a binomial distribution with n = n_0
and a Poisson distribution.


7. Estimation of n. Although the purpose of this paper has been accomplished
in the preceding sections, it is interesting to observe the role played by the closely
related Poisson index of dispersion in the estimation of n.

Approximate confidence limits for n may be obtained by means of (10).
If χ²_{1-α} is a value of χ² such that P[χ² > χ²_{1-α}] = 1 - α, then, to this same order
of approximation, the probability is 1 - 2α that

\chi_{1-\alpha}^2 < \frac{\sum_1^k (x_i - \bar{x})^2}{\bar{x}(1 - \bar{x}/n)} < \chi_\alpha^2.

If these inequalities are solved for n, the following 100(1 - 2α) percent approximate
confidence limits for n will be obtained.

(17)  \frac{\bar{x} \chi_\alpha^2}{\chi_\alpha^2 - \sum (x_i - \bar{x})^2 / \bar{x}} < n < \frac{\bar{x} \chi_{1-\alpha}^2}{\chi_{1-\alpha}^2 - \sum (x_i - \bar{x})^2 / \bar{x}}.


Only the lower limit here will possess optimum properties. Now it will be
observed that only positive values of n will be admissible if

\frac{\sum (x_i - \bar{x})^2}{\bar{x}} < \chi_{1-\alpha}^2,

whereas only negative values will be admissible if

\frac{\sum (x_i - \bar{x})^2}{\bar{x}} > \chi_\alpha^2.

The range of values will be infinite in each case if there is equality rather than
inequality. If, however,

\chi_{1-\alpha}^2 < \frac{\sum (x_i - \bar{x})^2}{\bar{x}} < \chi_\alpha^2,

then both positive and negative values of n over infinite ranges will be admissible.
Since n increases as the Poisson index \sum (x_i - \bar{x})^2 / \bar{x} increases until it becomes
infinite and then increases from minus infinity through negative values, (17)
may still be thought of as giving an interval (infinite) of values with a positive
"lower" limit and a negative "upper" limit. Thus, the familiar Poisson index
of dispersion plays an interesting role in determining whether a Poisson assumption
is reasonable as far as admissible values of n are concerned.

If the population is truly binomial, negative values of n must be ruled out;
consequently a Poisson assumption becomes increasingly tenable as the Poisson
index increases. However, experience has shown [7] that a negative binomial
distribution is often more realistic in describing data supposedly drawn from a
binomial or Poisson population than is the assumed distribution; consequently
a negative binomial should be given consideration if (17) yields only negative
values or if it yields a negative "upper" limit that is numerically small relative
to a positive "lower" limit.

It is also interesting to consider the point estimation of n. Here, it is customary
[7] to estimate n by means of

\hat{n} = \frac{\bar{x}^2}{\bar{x} - \sum (x_i - \bar{x})^2 / k}.

Thus, a positive, infinite, or negative estimate for n will be obtained according as
the Poisson index is less than, equal to, or greater than k.
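
The dependence of the sign of the estimate on the Poisson index can be seen in a short sketch. It assumes that the customary estimate is the moment estimate obtained by equating the sample mean and the sample variance (with divisor k) to np and np(1 - p), which gives n = x̄² / (x̄ - Σ(x_i - x̄)²/k); the function names are hypothetical.

```python
def poisson_index(xs):
    """Poisson index of dispersion: sum of squared deviations over the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / mean


def estimate_n(xs):
    """Moment estimate of the binomial n (an assumption of this sketch)."""
    k = len(xs)
    mean = sum(xs) / k
    variance = sum((x - mean) ** 2 for x in xs) / k
    if mean == variance:
        return float("inf")  # Poisson index equal to k
    return mean ** 2 / (mean - variance)


# Hypothetical data with k = 4 samples.
under_dispersed = [2, 4, 6, 4]   # Poisson index 2 < k, positive estimate
over_dispersed = [0, 8, 0, 8]    # Poisson index 16 > k, negative estimate
print(poisson_index(under_dispersed), estimate_n(under_dispersed))
print(poisson_index(over_dispersed), estimate_n(over_dispersed))
```

A negative estimate, as in the second example, is the situation in which the text suggests giving consideration to a negative binomial distribution.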
BILINEAR FORMS IN NORMALLY CORRELATED VARIABLES

By Allen T. Craig
University of Iowa

1. Summary. If a variable x is normally distributed with mean zero, we have
previously given a necessary and sufficient condition (see references at end of
this paper) for the independence of two real symmetric quadratic forms in n
independent values of that variable. This condition is that the product of the
matrices of the forms should vanish. In the present paper, we have proved
that the same algebraic condition is both necessary and sufficient for the independence
of two real symmetric bilinear, or a real symmetric bilinear and
quadratic form, in normally correlated variables.

2. Introduction. In this paper, we determine the moment generating function
of the joint distribution of two real symmetric bilinear forms in certain normally
correlated variables and derive a necessary and sufficient condition for the
independence, in the probability sense, of these forms. We further investigate
the condition for independence, in the probability sense, of real symmetric
bilinear and quadratic forms.

3. The moment generating function of the distribution of real symmetric
bilinear forms. Let the two variables x and y have a joint normal distribution
with means zero, unit variances and correlation coefficient $\rho$. From this
bivariate distribution, repeated random samples of n pairs, say $(x_1, y_1)$, $(x_2, y_2)$,
$\cdots$, $(x_n, y_n)$, are drawn. Let $C = \| c_{jk} \|$ be a real symmetric matrix and write
$\theta = \sum\sum c_{jk} x_j y_k$. The moment generating function of the distribution of $\theta$
is then given by

$$\varphi(t) = E[e^{t\theta}] = (2\pi\sqrt{1-\rho^2})^{-n} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{t\theta - Q} \, dy_n \, dx_n \cdots dy_1 \, dx_1,$$

where

$$Q = \frac{\sum_j (x_j^2 + y_j^2 - 2\rho x_j y_j)}{2(1 - \rho^2)}$$

and $\theta$ is defined above. If we subject the x's and y's to the same linear
homogeneous transformation with appropriately chosen orthogonal matrix L, then
Q remains invariant and $\theta$ becomes $\sum \lambda_j x_j y_j$ where the $\lambda$'s are the n real roots of
the characteristic equation of C, that is, of $| C - \lambda I | = 0$. The integrations
are then easily effected and we find that

$$\varphi(t) = \Big\{ \prod_j [1 - 2\rho t \lambda_j - (1 - \rho^2) t^2 \lambda_j^2] \Big\}^{-1/2}$$

$$= \{ | I - t(\rho + 1)C | \cdot | I - t(\rho - 1)C | \}^{-1/2}$$

$$= | I - 2\rho t C - (1 - \rho^2) t^2 C^2 |^{-1/2},$$

where I is the unit matrix of order n and the vertical bars, as usual, indicate the
determinant of the enclosed matrix.
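The one-factor building block of this result, $E[e^{t\lambda xy}] = [1 - 2\rho t\lambda - (1-\rho^2)t^2\lambda^2]^{-1/2}$, can be checked by simulation. The following Python sketch is illustrative only; the function names, the seed and the number of draws are assumptions of the example, not part of the paper.

```python
import math
import random


def mgf_closed_form(t, rho, lam=1.0):
    """[1 - 2*rho*t*lam - (1 - rho^2)*(t*lam)^2]^(-1/2) for one pair (x, y)."""
    return (1.0 - 2.0 * rho * t * lam - (1.0 - rho ** 2) * (t * lam) ** 2) ** -0.5


def mgf_factored_form(t, rho, lam=1.0):
    """The same quantity written with the factors (rho + 1) and (rho - 1)."""
    return ((1.0 - t * (rho + 1.0) * lam) * (1.0 - t * (rho - 1.0) * lam)) ** -0.5


def mgf_monte_carlo(t, rho, lam=1.0, draws=200_000, seed=0):
    """Sample mean of exp(t*lam*x*y) for unit-variance normals with correlation rho."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho ** 2)
    total = 0.0
    for _ in range(draws):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + s * rng.gauss(0.0, 1.0)  # gives corr(x, y) = rho
        total += math.exp(t * lam * x * y)
    return total / draws


if __name__ == "__main__":
    print(mgf_closed_form(0.2, 0.5), mgf_monte_carlo(0.2, 0.5))
```

The value of t must be small enough that the bracketed quantity stays positive; otherwise the moment generating function does not exist.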

Next, let $A = \| a_{jk} \|$ and $B = \| b_{jk} \|$ be two real symmetric matrices each of
order n. Write $\theta_1 = \sum\sum a_{jk} x_j y_k$ and $\theta_2 = \sum\sum b_{jk} x_j y_k$ where the x's and y's are the
items of the sample randomly drawn from the bivariate distribution previously
described. The moment generating function of the joint distribution of $\theta_1$ and $\theta_2$
is then given by

$$\varphi(t_1, t_2) = E[e^{t_1\theta_1 + t_2\theta_2}] = (2\pi\sqrt{1-\rho^2})^{-n} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{t_1\theta_1 + t_2\theta_2 - Q} \, dy_n \, dx_n \cdots dy_1 \, dx_1,$$

where $\theta_1$, $\theta_2$, and Q have the meanings previously assigned to them. If we
pursue a line of reasoning similar to that above, we find that

$$\varphi(t_1, t_2) = | I - 2\rho(t_1 A + t_2 B) - (1 - \rho^2)(t_1 A + t_2 B)^2 |^{-1/2}.$$

4. The independence of bilinear forms. It is clear that there exist positive
numbers, say $h_1$ and $h_2$, such that $\varphi(t_1, t_2)$ exists for $0 < t_1 < h_1$ and $0 < t_2 < h_2$.
It is well known that a necessary and sufficient condition for the independence
of $\theta_1$ and $\theta_2$ is that $\varphi(t_1, t_2)$ shall factor into the product $\varphi(t_1, 0)\varphi(0, t_2)$. If then,
we assume $\theta_1$ and $\theta_2$ to be independent, we have essentially

(1)  $$| I - 2\rho(t_1 A + t_2 B) - (1 - \rho^2)(t_1 A + t_2 B)^2 | = | I - 2\rho t_1 A - (1 - \rho^2) t_1^2 A^2 | \cdot | I - 2\rho t_2 B - (1 - \rho^2) t_2^2 B^2 |.$$

If h denotes the smaller of $h_1$ and $h_2$, then the factored form holds for
$0 < t_1, t_2 < h$, and hence for all real values of $t_1$ and $t_2$. In particular it holds
for $t_2 = t_1$ so that

$$| I - 2\rho t_1 (A + B) - (1 - \rho^2) t_1^2 (A + B)^2 | = | I - 2\rho t_1 A - (1 - \rho^2) t_1^2 A^2 | \cdot | I - 2\rho t_1 B - (1 - \rho^2) t_1^2 B^2 |.$$

Let $r_1$, $r_2$, and $r \le r_1 + r_2$ denote the ranks of the matrices A, B, and A + B.
Further let the real non-zero roots of the characteristic equations of these
matrices be denoted respectively by $\alpha_1, \alpha_2, \cdots, \alpha_{r_1}$, $\beta_1, \beta_2, \cdots, \beta_{r_2}$, and $\gamma_1, \gamma_2,
\cdots, \gamma_r$. Then the members of the preceding equation may be written

$$\prod_{j=1}^{r} [1 - t_1(\rho + 1)\gamma_j][1 - t_1(\rho - 1)\gamma_j]$$

and

$$\prod_{j=1}^{r_1} [1 - t_1(\rho + 1)\alpha_j][1 - t_1(\rho - 1)\alpha_j] \prod_{j=1}^{r_2} [1 - t_1(\rho + 1)\beta_j][1 - t_1(\rho - 1)\beta_j]$$

respectively. It is seen that the left member is a polynomial in $t_1$ of degree 2r
and that the right member is a polynomial in $t_1$ of degree $2(r_1 + r_2)$. Accordingly,
$r = r_1 + r_2$ and the roots $\gamma_1, \cdots, \gamma_r$ consist of the roots $\alpha_1, \cdots, \alpha_{r_1}$,
$\beta_1, \cdots, \beta_{r_2}$. Thus, if $\theta_1$ and $\theta_2$ are independent, then the rank of A + B
is the sum of the ranks of A and B and the non-zero roots of the characteristic
equation of A + B consist of those of the characteristic equation of A together
with those of B. Further, if in (1) we put $t_2 = v t_1$, where v is real, we have

$$| I - 2\rho t_1 (A + vB) - (1 - \rho^2) t_1^2 (A + vB)^2 | = | I - 2\rho t_1 A - (1 - \rho^2) t_1^2 A^2 | \cdot | I - 2\rho t_1 vB - (1 - \rho^2) t_1^2 v^2 B^2 |.$$

Denote the rank of A + vB by r' and the non-zero roots of its characteristic
equation by $\delta_1, \cdots, \delta_{r'}$. The immediately preceding equation can then be
written

$$\prod_{j=1}^{r'} [1 - t_1(\rho + 1)\delta_j][1 - t_1(\rho - 1)\delta_j] = \prod_{j=1}^{r_1} [1 - t_1(\rho + 1)\alpha_j][1 - t_1(\rho - 1)\alpha_j] \prod_{j=1}^{r_2} [1 - t_1(\rho + 1)v\beta_j][1 - t_1(\rho - 1)v\beta_j].$$

From this we infer that, apart from zero roots, the roots of the characteristic
equation of A + vB are $\alpha_1, \cdots, \alpha_{r_1}$, $v\beta_1, \cdots, v\beta_{r_2}$.

If a symmetric matrix, say $M(v)$, has elements which are real polynomials
in the real variable v, and if the determinant

$$| M(v) - \lambda I | = (-1)^n [\lambda - p_1(v)][\lambda - p_2(v)] \cdots [\lambda - p_n(v)],$$

where $p_1(v), p_2(v), \cdots, p_n(v)$ are likewise real polynomials in v, then there exists,
for all real values of v, a real orthogonal matrix, say $L(v)$, such that

$$L'(v) M(v) L(v) = \begin{pmatrix} p_1(v) & 0 & \cdots & 0 \\ 0 & p_2(v) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & p_n(v) \end{pmatrix}.$$

Furthermore,¹ $dL(v)/dv$ exists for all real values of v.

¹ A number of years ago, in connection with another problem, the writer sought the
assistance of Professor N. H. McCoy for a proof that $L(v)$ is differentiable at v = 0.
Professor McCoy's elegant demonstration of the existence of $L(v)$ showed that each element
of this orthogonal matrix is itself a real polynomial in v, divided by the positive
root of another real polynomial, which polynomial is never negative and which vanishes
for no real value of v. Thus the derivative of $L(v)$ exists not only for v = 0 but for all
real values of v. The writer thanks Professor McCoy for his kind and generous assistance.

Since

$$| A + vB - \lambda I | = (-1)^n \lambda^{n - (r_1 + r_2)} (\lambda - \alpha_1) \cdots (\lambda - \alpha_{r_1})(\lambda - v\beta_1) \cdots (\lambda - v\beta_{r_2}),$$

then A + vB belongs to the class $M(v)$ so we have

(2)  $$L'(v)(A + vB)L(v) = \mathrm{diag}(\alpha_1, \cdots, \alpha_{r_1}, v\beta_1, \cdots, v\beta_{r_2}, 0, \cdots, 0).$$

In particular,

(3)  $$L'(0) A L(0) = \mathrm{diag}(\alpha_1, \cdots, \alpha_{r_1}, 0, \cdots, 0).$$

If we differentiate (2) with respect to v and subsequently set v = 0, we have

(4)  $$\frac{dL'(0)}{dv} A L(0) + L'(0) B L(0) + L'(0) A \frac{dL(0)}{dv} = \mathrm{diag}(0, \cdots, 0, \beta_1, \cdots, \beta_{r_2}, 0, \cdots, 0),$$

in which the first $r_1$ diagonal elements of the right member are zero.

Since $L(v)$ is orthogonal, then $L'(v)L(v) = I$. Upon differentiating both members
with respect to v and subsequently setting v = 0, it is seen that

$$L'(0)\frac{dL(0)}{dv} = -\frac{dL'(0)}{dv} L(0),$$

so that $L'(0)\,dL(0)/dv$ is a skew-symmetric matrix, say S. Further

(5)  $$\frac{dL'(0)}{dv} = -L'(0) \frac{dL(0)}{dv} L'(0)$$

and, by taking conjugates,

(6)  $$\frac{dL(0)}{dv} = -L(0) \frac{dL'(0)}{dv} L(0).$$

If we multiply (5) on the right by $AL(0)$ and (6) on the left by $L'(0)A$, we see
that (4) may be written

(7)  $$L'(0) B L(0) = \mathrm{diag}(0, \cdots, 0, \beta_1, \cdots, \beta_{r_2}, 0, \cdots, 0) + S L'(0) A L(0) - L'(0) A L(0) S.$$

Since S is skew-symmetric and since $L'(0)AL(0)$ is given by (3), then each
element on the principal diagonal of $SL'(0)AL(0)$ and $L'(0)AL(0)S$ is zero.
Further, since $L'(0)BL(0)$ is symmetric, then $L'(0)BL(0)$ takes the form of a
symmetric matrix whose principal diagonal is that of the first term in the right
member of (7) and whose elements off the principal diagonal we denote by $h_{jk}$.
Because the non-zero roots of the characteristic equation of B are $\beta_1,
\cdots, \beta_{r_2}$, then the sum of all two-rowed principal minors of the determinant of
$L'(0)BL(0)$ must equal the sum of the products of $\beta_1, \cdots, \beta_{r_2}$ taken two at a
time. That is

$$\sum_{i<j} \beta_i \beta_j = \sum_{i<j} \beta_i \beta_j - \sum_{j<k} h_{jk}^2,$$

so that each $h_{jk}$, being real, is zero. Accordingly, $SL'(0)AL(0) - L'(0)AL(0)S$
is a zero matrix and $L'(0)BL(0)$ is given by the first term in the right member
of (7). We then have

$$L'(0) A L(0) L'(0) B L(0) = L'(0) A B L(0) = 0,$$

from which it follows that AB = 0. Thus, if the real symmetric bilinear forms
$\theta_1$ and $\theta_2$ are independent in the probability sense, the product of their matrices
is zero.

If, conversely, AB = 0, then

$$\varphi(t_1, t_2) = | I - 2\rho(t_1 A + t_2 B) - (1 - \rho^2)(t_1^2 A^2 + t_2^2 B^2) |^{-1/2}$$

$$= | [I - 2\rho t_1 A - (1 - \rho^2) t_1^2 A^2][I - 2\rho t_2 B - (1 - \rho^2) t_2^2 B^2] |^{-1/2}$$

$$= \varphi(t_1, 0)\varphi(0, t_2),$$

and $\theta_1$ and $\theta_2$ are independent. This establishes the following theorem.

Theorem I. Let x and y be normally correlated with means zero, unit variances,
and correlation coefficient $\rho$. Let $\theta_1$ and $\theta_2$ be two real symmetric bilinear forms in n
random pairs of values of x and y, say $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$. A necessary
and sufficient condition that $\theta_1$ and $\theta_2$ be independent in the probability sense is that
the product of the matrices of the forms be zero.
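Because the joint moment generating function is a determinant raised to the power -1/2, the factorization required for independence can be checked numerically on the determinants themselves. The sketch below is a pure-Python illustration; the example matrices A, B and C, the helper names, and the chosen values of rho, t1 and t2 are assumptions made for the example only.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in M]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def combine(t1, A, t2, B):
    """The matrix t1*A + t2*B."""
    n = len(A)
    return [[t1 * A[i][j] + t2 * B[i][j] for j in range(n)] for i in range(n)]


def mgf_determinant(C, rho):
    """| I - 2*rho*C - (1 - rho^2)*C^2 |."""
    n = len(C)
    C2 = matmul(C, C)
    return det([[(1.0 if i == j else 0.0) - 2.0 * rho * C[i][j]
                 - (1.0 - rho ** 2) * C2[i][j] for j in range(n)]
                for i in range(n)])


def factorization_gap(A, B, rho, t1, t2):
    """|joint determinant - product of the two marginal determinants|."""
    joint = mgf_determinant(combine(t1, A, t2, B), rho)
    first = mgf_determinant(combine(t1, A, 0.0, B), rho)
    second = mgf_determinant(combine(0.0, A, t2, B), rho)
    return abs(joint - first * second)


A = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
B = [[1.0, -1.0, 0.0], [-1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]   # AB = 0
C = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]     # AC != 0

if __name__ == "__main__":
    print(factorization_gap(A, B, 0.3, 0.1, 0.2))  # essentially zero
    print(factorization_gap(A, C, 0.3, 0.1, 0.2))  # clearly non-zero
```

A single pair (t1, t2) can only refute the factorization, not prove it; the theorem is what guarantees that the gap vanishes for every admissible pair when AB = 0.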

5. Simultaneous reduction of quadratic or bilinear forms. The argument
of Section 4 may be used to establish in a very simple manner the following
theorem.

Theorem II. Let A and B be two real symmetric matrices with constant elements,
each matrix of order n. A necessary and sufficient condition that there exist
a real orthogonal matrix L of order n such that simultaneously each of L'AL and L'BL
is in canonical form, wherein no non-zero elements occupy corresponding positions
on the principal diagonals, is that AB = 0.

For if such an orthogonal matrix L exists, it is evident that L'ALL'BL =
L'ABL = 0 from which it follows that AB = 0. Conversely, if AB = 0, then v
being a real scalar, the matrix $(A - \lambda I)(vB - \lambda I)$ is equal to the matrix
$-\lambda[(A + vB) - \lambda I]$. These matrices being equal, their determinants are
equal so that A + vB belongs to the class $M(v)$ of Section 4. Thus L may be
taken as $L(0)$ and simultaneously L'AL and L'BL are of the form stated in the
theorem.
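The matrix identity used in the converse, and the canonical forms it leads to, can be seen on a small case. In the following Python sketch the 2 x 2 matrices are built from a 45-degree rotation so that AB = 0 by construction; the matrices, the scalars v and lambda, and the helper names are all assumptions chosen for illustration.

```python
import math


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def minus_lambda(M, lam):
    """The matrix M - lam*I."""
    n = len(M)
    return [[M[i][j] - (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def identity_gap(A, B, v, lam):
    """Largest entry of (A - lam I)(vB - lam I) + lam[(A + vB) - lam I]."""
    n = len(A)
    vB = [[v * B[i][j] for j in range(n)] for i in range(n)]
    left = matmul(minus_lambda(A, lam), minus_lambda(vB, lam))
    total = [[A[i][j] + vB[i][j] for j in range(n)] for i in range(n)]
    right = [[-lam * e for e in row] for row in minus_lambda(total, lam)]
    return max(abs(left[i][j] - right[i][j]) for i in range(n) for j in range(n))


c = 1.0 / math.sqrt(2.0)
L = [[c, -c], [c, c]]                                    # orthogonal (rotation)
A = matmul(matmul(L, [[2.0, 0.0], [0.0, 0.0]]), transpose(L))
B = matmul(matmul(L, [[0.0, 0.0], [0.0, 3.0]]), transpose(L))

if __name__ == "__main__":
    print(identity_gap(A, B, v=-1.3, lam=0.7))           # zero up to rounding
    print(matmul(matmul(transpose(L), A), L))            # diagonal, non-zero in position 1
    print(matmul(matmul(transpose(L), B), L))            # diagonal, non-zero in position 2
```

The two diagonal forms have their non-zero elements in different positions, which is the situation described in Theorem II.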

6. Independence of bilinear and quadratic forms. Let $\theta = \sum\sum a_{jk} x_j y_k$ be a
real symmetric bilinear form of rank $r_1$ in the previously defined variables
$(x_1, y_1), \cdots, (x_n, y_n)$ and let $q = \sum\sum b_{jk} x_j x_k$ be a real symmetric quadratic form
of rank $r_2$ in $x_1, x_2, \cdots, x_n$. As usual, denote the non-zero roots of the
characteristic equations of A and B by $\alpha_1, \alpha_2, \cdots, \alpha_{r_1}$ and $\beta_1, \beta_2, \cdots, \beta_{r_2}$ respectively.
The moment generating function of the joint distribution of $\theta$ and q is

$$\varphi(t_1, t_2) = (2\pi\sqrt{1-\rho^2})^{-n} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{t_1\theta + t_2 q - Q} \, dy_n \, dx_n \cdots dy_1 \, dx_1,$$

where, as previously,

$$Q = \frac{1}{2(1 - \rho^2)} \sum_j (x_j^2 + y_j^2 - 2\rho x_j y_j).$$

We first orthogonally transform the variables so that the exponent in the
integrand becomes, upon writing $\| l_{jk} \| = L'BL$,

$$t_1 \sum_j \alpha_j x_j' y_j' + t_2 \sum\sum l_{jk} x_j' x_k' - \frac{1}{2(1 - \rho^2)} \sum_j (x_j'^2 + y_j'^2 - 2\rho x_j' y_j').$$

We then integrate on $y_1', y_2', \cdots, y_n'$ and obtain for the exponent in the
integrand

$$t_2 \sum\sum l_{jk} x_j' x_k' - \tfrac{1}{2} \sum_j x_j'^2 + \rho t_1 \sum_j \alpha_j x_j'^2 + \frac{1 - \rho^2}{2} t_1^2 \sum_j \alpha_j^2 x_j'^2.$$

If we effect on the variables $x_1', \cdots, x_n'$ the inverse of the orthogonal
transformation initially used on the x's and y's, the exponent in the integrand becomes,
using $\| d_{jk} \| = A^2$,

$$t_2 \sum\sum b_{jk} x_j x_k - \tfrac{1}{2} \sum_j x_j^2 + \rho t_1 \sum\sum a_{jk} x_j x_k + \frac{1 - \rho^2}{2} t_1^2 \sum\sum d_{jk} x_j x_k$$

or

$$-\tfrac{1}{2} \sum\sum [\delta_{jk} - 2\rho t_1 a_{jk} - (1 - \rho^2) t_1^2 d_{jk} - 2 t_2 b_{jk}] x_j x_k,$$

where $\delta_{jk}$ equals 1 or 0 according as j does or does not equal k. Hence,

(8)  $$\varphi(t_1, t_2) = | I - 2\rho t_1 A - (1 - \rho^2) t_1^2 A^2 - 2 t_2 B |^{-1/2}.$$

If $\theta$ and q are independent, we have

(9)  $$| I - 2\rho t_1 A - (1 - \rho^2) t_1^2 A^2 - 2 t_2 B | = | I - 2\rho t_1 A - (1 - \rho^2) t_1^2 A^2 | \cdot | I - 2 t_2 B |,$$

for $0 < t_1 < h_1$ and $0 < t_2 < h_2$. As before, the members of (9) are polynomials
which, being equal for $0 < t_1, t_2 < h$, are equal for all real values of $t_1$ and $t_2$.
If we put $t_1 = 1$ and $t_2 = v t_1 = v$, where v is real, then (9) becomes

$$| I - 2\rho A - (1 - \rho^2) A^2 - 2vB | = | I - 2\rho A - (1 - \rho^2) A^2 | \cdot | I - 2vB |$$

$$= \prod_{j=1}^{r_1} [1 - (\rho - 1)\alpha_j][1 - (\rho + 1)\alpha_j] \prod_{j=1}^{r_2} (1 - 2v\beta_j).$$



That is,

$$| 2\rho A + (1 - \rho^2) A^2 + 2vB - \lambda I | = (-1)^n \lambda^{n - (r_1 + r_2)} [\lambda - 2\rho\alpha_1 - (1 - \rho^2)\alpha_1^2] \cdots [\lambda - 2\rho\alpha_{r_1} - (1 - \rho^2)\alpha_{r_1}^2] \cdot [\lambda - 2v\beta_1] \cdots [\lambda - 2v\beta_{r_2}]$$

so that $2\rho A + (1 - \rho^2) A^2 + 2vB$ is a matrix of the class $M(v)$. Hence we write

$$L'(v)[2\rho A + (1 - \rho^2) A^2 + 2vB]L(v) = \mathrm{diag}(2\rho\alpha_1 + (1 - \rho^2)\alpha_1^2, \cdots, 2\rho\alpha_{r_1} + (1 - \rho^2)\alpha_{r_1}^2, 2v\beta_1, \cdots, 2v\beta_{r_2}, 0, \cdots, 0).$$

The argument of Section 4 shows that $L'(0)[2\rho A + (1 - \rho^2) A^2]L(0)L'(0)2BL(0)$
is a zero matrix, from which it follows that $2\rho AB + (1 - \rho^2) A^2 B = 0$. But
this imposes on $\rho$, $n^2$ conditions of the form

$$2\rho\, l_{jk} + (1 - \rho^2) m_{jk} = 0, \qquad (j, k = 1, 2, \cdots, n).$$

Since these hold for every $-1 < \rho < 1$, they hold identically. Hence each $l_{jk}$
and $m_{jk}$ is zero. In particular, $\| l_{jk} \| = AB = 0$ if $\theta$ and q are independent.
Conversely, if AB = 0, we see by Theorem II that (8) becomes

$$\varphi(t_1, t_2) = \varphi(t_1, 0)\varphi(0, t_2),$$

so that $\theta$ and q are independent. This yields Theorem III.

Theorem III. Let x and y be normally correlated with means zero, unit variances,
and correlation coefficient $\rho$. Let $\theta$ be a real symmetric bilinear form in the
n random pairs of values of x and y, say $(x_1, y_1), \cdots, (x_n, y_n)$, and let q be a real
symmetric quadratic form in $x_1, x_2, \cdots, x_n$ (or $y_1, \cdots, y_n$). A necessary and
sufficient condition that $\theta$ and q be independent in the probability sense, is that the
product of the matrices of the forms be zero.

For example, let $\theta$ be n times the sample covariance and let q be n times the
square of the mean of the x's. Then

$$\theta = \sum_j (x_j - \bar{x})(y_j - \bar{y}) = \sum\sum a_{jk} x_j y_k,$$

where

$$a_{jk} = 1 - \frac{1}{n}, \quad j = k, \qquad a_{jk} = -\frac{1}{n} \quad \text{otherwise},$$

and

$$q = n\bar{x}^2 = \sum\sum b_{jk} x_j x_k, \qquad b_{jk} = 1/n \text{ for } j, k = 1, 2, \cdots, n.$$

Since AB = 0, then $\theta$ and q are independent, a fact otherwise known but perhaps
not so easily established.
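The condition AB = 0 for this example can be confirmed exactly with rational arithmetic. The Python sketch below is an illustration only; the function names and the sample values are hypothetical.

```python
from fractions import Fraction


def covariance_form_matrix(n):
    """A with a_jj = 1 - 1/n and a_jk = -1/n otherwise."""
    return [[(Fraction(1) if j == k else Fraction(0)) - Fraction(1, n)
             for k in range(n)] for j in range(n)]


def mean_square_form_matrix(n):
    """B with every element equal to 1/n."""
    return [[Fraction(1, n) for _ in range(n)] for _ in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bilinear_form(M, xs, ys):
    """The double sum of m_jk * x_j * y_k."""
    n = len(M)
    return sum(M[j][k] * xs[j] * ys[k] for j in range(n) for k in range(n))


if __name__ == "__main__":
    n = 5
    A, B = covariance_form_matrix(n), mean_square_form_matrix(n)
    xs = [Fraction(v) for v in (1, 2, 4, 7, 9)]
    ys = [Fraction(v) for v in (3, 1, 4, 1, 5)]
    print(all(e == 0 for row in matmul(A, B) for e in row))  # product of matrices is zero
    print(bilinear_form(A, xs, ys), bilinear_form(B, xs, xs))
```

Here `bilinear_form(A, xs, ys)` reproduces the sum of products of deviations and `bilinear_form(B, xs, xs)` reproduces n times the squared mean.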
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ON THE CHARLIER TYPE B SERIES 
By S. Kullback
George Washington University

1. Introduction. The Type B Series of Charlier has been discussed in some
detail in the literature (See references at the end of the paper). The problem
of the convergence of the Type B series has been considered by Pollaczek-Geiringer
[12], [13], Szegö [12] (page 110), Uspensky [18], Jacob [5], Schmidt [16]
and Obrechkoff [11]. There is presented in the following a method of development
of the Type B series which is believed to be of some interest, including a
necessary and sufficient condition for the convergence which is basically the
same as that of Schmidt [16]. A result of Steffensen [17] is extended and shown
to be related to the Charlier Type B series.


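2. Statement of results. Consider the function $p(r)$, defined for r = 0, 1, 2,
$\cdots$, and such that

(2.1)  $$\sum_{r=0}^{\infty} p(r) = 1; \qquad \sum_{r=0}^{\infty} | p(r) | = A,$$

where A is some finite value. Let the n-th factorial moment be defined by

(2.2)  $$\mu_{(0)} = 1, \qquad \mu_{(n)} = \sum_{r=0}^{\infty} r(r - 1)(r - 2) \cdots (r - n + 1) p(r), \quad (n = 1, 2, \cdots).$$

For arbitrary $\lambda$ let

(2.3)  $$L_n = \mu_{(n)} - n\mu_{(n-1)}\lambda + \frac{n(n-1)}{2!}\mu_{(n-2)}\lambda^2 - \frac{n(n-1)(n-2)}{3!}\mu_{(n-3)}\lambda^3 + \cdots + (-1)^n \lambda^n.$$

We prove the following results:

Theorem. A necessary and sufficient condition that the function p(r) of (2.1)
may be expressed by the absolutely convergent series

(2.4)  $$p(r) = \frac{e^{-\lambda}\lambda^r}{r!} + L_1 \frac{\partial}{\partial\lambda}\frac{e^{-\lambda}\lambda^r}{r!} + \frac{L_2}{2!}\frac{\partial^2}{\partial\lambda^2}\frac{e^{-\lambda}\lambda^r}{r!} + \cdots$$

is that

(2.5)  $$1 + | \mu_{(1)} | + \frac{1}{2!} | \mu_{(2)} | + \frac{1}{3!} | \mu_{(3)} | + \cdots + \frac{1}{n!} | \mu_{(n)} | + \cdots$$

converges where $L_n$ is defined as in (2.3).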




3. Generating functions. For the function p(r) of (2.1) consider the generating
function defined by

(3.1)  $$\varphi(z) = \sum_{r=0}^{\infty} z^r p(r)$$

where z is a complex variable. Because of (2.1) the right member
of (3.1) is uniformly and absolutely convergent for $| z | \le 1$ so that the radius
of convergence of (3.1) is some value $R_1 \ge 1$.

The Taylor expansion of $\varphi(z)$ about the point z = 1 is given by

(3.2)  $$\varphi(z) = \varphi(1) + (z - 1)\varphi'(1) + \frac{(z - 1)^2}{2!}\varphi''(1) + \cdots$$

where, as may be readily obtained from (3.1),

(3.3)  $$\varphi^{(n)}(1) = \sum_{r=0}^{\infty} r(r - 1)(r - 2) \cdots (r - n + 1) p(r) = \mu_{(n)}.$$

If it is assumed that (2.5) converges, then

(3.4)  $$\varphi(z) = 1 + (z - 1)\mu_{(1)} + \frac{(z - 1)^2}{2!}\mu_{(2)} + \cdots$$

is uniformly and absolutely convergent for $| z - 1 | \le 1$.

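4. Sufficiency. For arbitrary $\lambda$ let us set

(4.1)  $$e^{-\lambda(z-1)} \Big\{ 1 + \mu_{(1)}(z - 1) + \frac{\mu_{(2)}}{2!}(z - 1)^2 + \cdots \Big\} = 1 + L_1(z - 1) + \frac{L_2}{2!}(z - 1)^2 + \cdots$$

where the right member, because of (3.4) is absolutely convergent for
$| z - 1 | \le 1$. The coefficients on the right side of (4.1) are given by

(4.2)  $$L_n = \mu_{(n)} - n\mu_{(n-1)}\lambda + \frac{n(n-1)}{2!}\mu_{(n-2)}\lambda^2 - \cdots + (-1)^n \lambda^n$$

and the factorial moments may also be expressed by

(4.3)  $$\mu_{(n)} = L_n + nL_{n-1}\lambda + \frac{n(n-1)}{2!}L_{n-2}\lambda^2 + \cdots + \lambda^n.$$

These relations are readily derived by expressing (4.1) symbolically as

(4.4)  $$e^{-\lambda(z-1) + \mu(z-1)} = e^{L(z-1)}$$

where after expansion $\mu^n$ and $L^n$ are to be replaced by $\mu_{(n)}$ and $L_n$ respectively.
(Cf. Jordan [7], p. 39). From (4.1) and (3.4) there is now derived

(4.5)  $$\varphi(z) = e^{\lambda(z-1)} \Big( 1 + L_1(z - 1) + \frac{L_2}{2!}(z - 1)^2 + \cdots \Big).$$

Since the right member of (4.5) is absolutely and uniformly convergent for
$| z - 1 | \le 1$ for arbitrary $\lambda$, it may be expressed as

(4.6)  $$\varphi(z) = \Big( 1 + L_1 \frac{\partial}{\partial\lambda} + \frac{L_2}{2!}\frac{\partial^2}{\partial\lambda^2} + \cdots \Big) e^{\lambda(z-1)}.$$

Since the radius of convergence of the right member of (4.6) is some value $R_2$
such that $| z - 1 | < R_2$, $R_2 \ge 1$, it may be expressed as a power series about z = 0, or

(4.7)  $$\varphi(z) = \Big( 1 + L_1 \frac{\partial}{\partial\lambda} + \frac{L_2}{2!}\frac{\partial^2}{\partial\lambda^2} + \cdots \Big) e^{-\lambda} \Big( 1 + \frac{\lambda z}{1!} + \frac{\lambda^2 z^2}{2!} + \cdots \Big).$$

Recalling now the definition of $\varphi(z)$ as given in (3.1), there is obtained by equating
coefficients of like powers of z in (3.1) and (4.7)

(4.8)  $$p(r) = \frac{e^{-\lambda}\lambda^r}{r!} + L_1 \frac{\partial}{\partial\lambda}\frac{e^{-\lambda}\lambda^r}{r!} + \frac{L_2}{2!}\frac{\partial^2}{\partial\lambda^2}\frac{e^{-\lambda}\lambda^r}{r!} + \cdots.$$

Since it may be readily shown that

(4.9)  $$\frac{\partial^n}{\partial\lambda^n}\frac{e^{-\lambda}\lambda^r}{r!} = (-1)^n \Delta^n \frac{e^{-\lambda}\lambda^r}{r!},$$

where

$$\Delta \frac{e^{-\lambda}\lambda^r}{r!} = \frac{e^{-\lambda}\lambda^r}{r!} - \frac{e^{-\lambda}\lambda^{r-1}}{(r-1)!}, \qquad \Delta^n \frac{e^{-\lambda}\lambda^r}{r!} = \Delta^{n-1}\frac{e^{-\lambda}\lambda^r}{r!} - \Delta^{n-1}\frac{e^{-\lambda}\lambda^{r-1}}{(r-1)!},$$

we may also write (4.8) as

(4.10)  $$p(r) = \frac{e^{-\lambda}\lambda^r}{r!} - L_1 \Delta \frac{e^{-\lambda}\lambda^r}{r!} + \frac{L_2}{2!}\Delta^2 \frac{e^{-\lambda}\lambda^r}{r!} - \frac{L_3}{3!}\Delta^3 \frac{e^{-\lambda}\lambda^r}{r!} + \cdots.$$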


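5. Necessity. Assume that the function p(r) of (2.1), for arbitrary $\lambda$, is
given by the absolutely convergent series

(5.1)  $$p(r) = \Big( 1 + L_1 \frac{\partial}{\partial\lambda} + \frac{L_2}{2!}\frac{\partial^2}{\partial\lambda^2} + \cdots \Big) \frac{e^{-\lambda}\lambda^r}{r!}.$$

Since $e^{-\lambda}\lambda^r/r!$ is continuous with respect to $\lambda$, there follows, where z is a complex
variable and $| z | \le 1$,

(5.2)  $$\sum_{r=0}^{\infty} z^r p(r) = \sum_{r=0}^{\infty} \frac{z^r e^{-\lambda}\lambda^r}{r!} + L_1 \frac{\partial}{\partial\lambda}\sum_{r=0}^{\infty} \frac{z^r e^{-\lambda}\lambda^r}{r!} + \frac{L_2}{2!}\frac{\partial^2}{\partial\lambda^2}\sum_{r=0}^{\infty} \frac{z^r e^{-\lambda}\lambda^r}{r!} + \cdots$$

$$= e^{\lambda(z-1)} \Big( 1 + L_1(z - 1) + \frac{L_2}{2!}(z - 1)^2 + \cdots \Big)$$

$$= 1 + M_1(z - 1) + \frac{M_2}{2!}(z - 1)^2 + \frac{M_3}{3!}(z - 1)^3 + \cdots$$

where

(5.3)  $$M_n = L_n + nL_{n-1}\lambda + \frac{n(n-1)}{2!}L_{n-2}\lambda^2 + \cdots + \lambda^n.$$

From (5.2) it follows that

(5.4)  $$M_n = \mu_{(n)}$$

where $\mu_{(n)}$ is as defined in (3.3). Since (5.1) becomes for r = 0, $\lambda$ = 0

(5.5)  $$p(0) = 1 - \mu_{(1)} + \frac{1}{2!}\mu_{(2)} - \cdots$$

the assumed absolute convergence implies that

(5.6)  $$1 + | \mu_{(1)} | + \frac{1}{2!} | \mu_{(2)} | + \frac{1}{3!} | \mu_{(3)} | + \cdots$$

converges.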

6. Remarks. Obrechkoff [11] shows that his result includes those of Pollaczek-Geiringer
[12], Szegö [12] (p. 110) and Jacob [5]. His theorem states that if
the function p(r), (r = 0, 1, 2, $\cdots$), satisfies the following conditions


(6.1)


^2''/ I p(r) 


is convergent for each finite number A , and 


(n + l)!^iA' r 

tends toward zero as n increases indefinitely then p(r) may be expressed in a
convergent Charlier Type B series.

Uspensky [18] shows that if

(6.3)  $$\sum_{r=0}^{\infty} z^r p(r)$$

has a radius of convergence R > 2 then p(r) may be expressed in a convergent
Charlier Type B series.

Schmidt [16] shows that a necessary and sufficient condition for the convergence
is that the function $\varphi(z)$ defined as in (3.1) (he does not impose the
condition (2.1) on p(r)) be regular inside the two circles $| z | < 1$ and $| z - 1 | < 1$
and with all its derivatives is continuous on the peripheries. In the case
that $p(r) \ge 0$, the condition (2.5) is stronger; in fact in this case Schmidt [16]
shows that a necessary and sufficient condition is that

$$\lim_{r \to \infty} p(r)\, 2^r r^k = 0$$

for all integral $k \ge 0$. If $p(r) \ge 0$, then Uspensky's condition is only just
enough stronger than Schmidt's to keep it from being sufficient.

If (6.1) is satisfied, or if (6.3) is satisfied then (3.1) is absolutely convergent
for $| z | \le 2$. Therefore, the point z = 2 is contained in the circle of convergence
of (3.2) or (3.4) which implies that

$$1 + | \mu_{(1)} | + \frac{1}{2!} | \mu_{(2)} | + \cdots + \frac{1}{n!} | \mu_{(n)} | + \cdots$$

converges.

It is deemed worthy of special mention to point out, as both Schmidt and
Uspensky have done, the striking fact that the necessary and sufficient condition
for the validity of (2.4) is independent of $\lambda$. This arbitrariness of $\lambda$ enables us
to dispose of it so as to obtain better convergence. Indeed if we set $\lambda = \mu_{(1)}$
then as is evident from (4.2) $L_1 = 0$.


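7. Special cases. It is of interest to note that (4.8) is the Taylor expansion
if $p(r) = e^{-\mu}\mu^r/r!$, (r = 0, 1, 2, $\cdots$), for then (4.2) becomes

(7.1)  $$L_n = (\mu - \lambda)^n$$

since for the Poisson Exponential Distribution ($e^{-\mu}\mu^r/r!$, (r = 0, 1, 2, $\cdots$))
the factorial moments are $\mu_{(n)} = \mu^n$. (4.8) is then

(7.2)  $$\frac{e^{-\mu}\mu^r}{r!} = \frac{e^{-\lambda}\lambda^r}{r!} + (\mu - \lambda)\frac{\partial}{\partial\lambda}\frac{e^{-\lambda}\lambda^r}{r!} + \frac{(\mu - \lambda)^2}{2!}\frac{\partial^2}{\partial\lambda^2}\frac{e^{-\lambda}\lambda^r}{r!} + \cdots.$$

If p(r) is finite, that is if p(r) = 0 for r > n + 1 then $\mu_{(k)} = 0$ for k > n + 1.
Thus, for a finite function the condition (2.5) is satisfied.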
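Relations (4.2) and (4.3) are binomial-type sums and are easy to evaluate exactly. The following Python sketch (function names are hypothetical, and exact rational arithmetic is an assumption made for clarity) computes the $L_n$ from given factorial moments, inverts the relation, and checks the Poisson case $L_n = (\mu - \lambda)^n$ together with the remark that $\lambda = \mu_{(1)}$ makes $L_1$ vanish.

```python
from fractions import Fraction
from math import comb


def type_b_coefficients(mu, lam):
    """L_n from factorial moments mu[0..N] by relation (4.2); mu[0] must be 1."""
    return [sum((-1) ** j * comb(n, j) * mu[n - j] * lam ** j
                for j in range(n + 1))
            for n in range(len(mu))]


def factorial_moments_from_coefficients(L, lam):
    """Factorial moments recovered from L_0..L_N by relation (4.3)."""
    return [sum(comb(n, j) * L[n - j] * lam ** j for j in range(n + 1))
            for n in range(len(L))]


if __name__ == "__main__":
    # Poisson law with mean 3: factorial moments are 3^n.
    mu = [Fraction(3) ** n for n in range(6)]
    lam = Fraction(1, 2)
    L = type_b_coefficients(mu, lam)
    print(L)                                             # powers of (3 - 1/2)
    print(factorial_moments_from_coefficients(L, lam))   # back to powers of 3
    print(type_b_coefficients(mu, mu[1])[1])             # L_1 when lambda = mu_(1)
```

With $\lambda$ equal to the Poisson mean every $L_n$ with $n \ge 1$ vanishes and the series reduces to its first term.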


8. Factorial moments. For functions p(r), (r = 0, 1, 2, $\cdots$), satisfying (2.5),
there may be derived from (3.1) and (3.4) the relation

(8.1)  $$r!\, p(r) = \mu_{(r)} - \mu_{(r+1)} + \frac{1}{2!}\mu_{(r+2)} - \frac{1}{3!}\mu_{(r+3)} + \cdots, \qquad (r = 0, 1, 2, \cdots),$$

since each side is $\varphi^{(r)}(0)$ derived respectively from (3.1) and (3.4). It should
be noted that for $\lambda = 0$ (4.5) leads to (8.1) rather than (4.8) so that (8.1) may
be considered as the Charlier Type B series for $\lambda = 0$. The result (8.1) was
derived for finite functions by Steffensen [17]. (Also compare Kaplansky [8]).
This may also be expressed symbolically by

(8.2)  $$p(r) = e^{-\mu}\mu^r/r!, \qquad (r = 0, 1, 2, \cdots),$$

where after expansion $\mu^n$ is to be replaced by $\mu_{(n)}$. It is of interest to note the
relation between the symbolic expression for p(r) as a Poisson Exponential in
(8.2) and the series (4.8), for (4.8) may be expressed symbolically as

$$p(r) = e^{-(\lambda + L)}(\lambda + L)^r/r! = e^{L\,\partial/\partial\lambda}\, \frac{e^{-\lambda}\lambda^r}{r!}$$

since $e^{a\,\partial/\partial x} f(x) = f(x + a)$ and the relations (4.2), (4.3), (4.4).
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9. Illustrations. Consider the function 
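(9.1)  $$p(r) = 1/2^{r+1}, \qquad (r = 0, 1, 2, \cdots).$$

For this function

(9.2)  $$\varphi(z) = \sum_{r=0}^{\infty} z^r p(r) = 1/(2 - z)$$

and

(9.3)  $$\varphi^{(n)}(1) = \mu_{(n)} = n!$$

so that (2.5) becomes

(9.4)  $$1 + 1 + 1 + \cdots$$

which does not converge. (It may be of interest to note that for this case
(8.1) yields

(9.5)  $$p(0) = 1 - 1 + 1 - 1 + 1 - \cdots.$$

The series on the right in (9.5) is not convergent but is summable $C_1$ to $\tfrac{1}{2}$. For
the latter see for example R. P. Agnew, [19].) In this case the first several
coefficients of (4.8) are for $\lambda = 1$,

(9.6)  $$L_1 = 0, \quad \frac{L_2}{2!} = .5000, \quad \frac{L_3}{3!} = .3333, \quad \frac{L_4}{4!} = .3750, \quad \frac{L_5}{5!} = .3667, \quad \frac{L_6}{6!} = .3681, \quad \frac{L_7}{7!} = .3679.$$

Let us now consider the function

(9.7)  $$p(0) = \tfrac{1}{2}, \qquad p(r) = 1/3^r, \quad (r = 1, 2, \cdots).$$

For this function

(9.8)  $$\varphi(z) = \sum_{r=0}^{\infty} z^r p(r) = \frac{1}{2} + \frac{z}{3 - z}$$

and

(9.9)  $$\varphi^{(n)}(1) = \mu_{(n)} = \frac{3 \cdot n!}{2^{n+1}}, \qquad (n = 1, 2, \cdots),$$

so that (2.5) becomes

(9.10)  $$1 + \frac{3}{4} + \frac{3}{8} + \frac{3}{16} + \cdots$$

which converges. For this case (8.1) yields

(9.11)  $$p(0) = 1 - \Big(\frac{3}{2}\Big)\frac{1}{2} + \Big(\frac{3}{2}\Big)\frac{1}{2^2} - \Big(\frac{3}{2}\Big)\frac{1}{2^3} + \cdots = \frac{1}{2},$$

$$p(1) = \Big(\frac{3}{2}\Big)\frac{1}{2} - 2\Big(\frac{3}{2}\Big)\frac{1}{2^2} + 3\Big(\frac{3}{2}\Big)\frac{1}{2^3} - \cdots = \frac{1}{3},$$

etc.

In this case, the first several coefficients of (4.8) are for $\lambda = 0.75$

(9.12)  $$L_1 = 0, \quad \frac{L_2}{2!} = .093750, \quad \frac{L_3}{3!} = .046875, \quad \frac{L_4}{4!} = .019043, \quad \frac{L_5}{5!} = .010840, \quad \frac{L_6}{6!} = .005173, \quad \frac{L_7}{7!} = .002622.$$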
O' 01 71 

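Let us now consider the function (suggested by Prof. C. Wexler)

(9.13)  $$p(0) = \frac{5}{3}, \qquad p(r) = \frac{5}{3}\Big(-\frac{2}{3}\Big)^r, \quad (r = 1, 2, \cdots).$$

For this function

(9.14)  $$\sum_{r=0}^{\infty} p(r) = 1, \qquad \sum_{r=0}^{\infty} | p(r) | = 5,$$

(9.15)  $$\varphi(z) = \sum_{r=0}^{\infty} z^r p(r) = 5/(3 + 2z),$$

(9.16)  $$\varphi^{(n)}(1) = \mu_{(n)} = (-1)^n n!\, (2/5)^n.$$

In this case (2.5) becomes

(9.17)  $$1 + \frac{2}{5} + \Big(\frac{2}{5}\Big)^2 + \cdots$$

which converges and (8.1) yields

(9.18)  $$p(0) = 1 + \frac{2}{5} + \Big(\frac{2}{5}\Big)^2 + \Big(\frac{2}{5}\Big)^3 + \cdots = 5/3,$$

$$p(1) = -\frac{2}{5} - 2!\Big(\frac{2}{5}\Big)^2 - \frac{3!}{2!}\Big(\frac{2}{5}\Big)^3 - \cdots = -\frac{10}{9},$$

etc.

Note that for this case (6.1) or (6.3) are not satisfied. Using $\lambda = 1$, it is
found that

(9.19)  $$L_1 = -1.4, \quad \frac{L_2}{2!} = 1.06, \quad \frac{L_3}{3!} = -.5906, \quad \frac{L_4}{4!} = .2779.$$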
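The coefficients $L_n/n!$ of these illustrations follow directly from the factorial moments through (4.2), and the series (4.10) can then be summed numerically. The Python sketch below is an illustration under stated assumptions: the function and variable names are hypothetical, the number of terms N = 60 is an arbitrary truncation, and the closed forms used for the factorial moments ($n!$, $3 \cdot n!/2^{n+1}$ and $(-1)^n n! (2/5)^n$) are those of the three functions considered in this section.

```python
from fractions import Fraction
from math import comb, exp, factorial


def scaled_coefficients(mu, lam):
    """[L_n / n!] from factorial moments mu[n], using relation (4.2)."""
    return [sum(Fraction(mu[n - j], factorial(n - j)) * (-lam) ** j / factorial(j)
                for j in range(n + 1))
            for n in range(len(mu))]


def poisson_term(lam, r):
    """e^(-lam) * lam^r / r! for r >= 0."""
    return exp(-lam) * lam ** r / factorial(r)


def type_b_partial_sum(mu, lam, r):
    """Partial sum of (4.10): sum over n of (L_n/n!) (-1)^n Delta^n psi(r)."""
    total = 0.0
    for n, c in enumerate(scaled_coefficients(mu, lam)):
        # n-th backward difference of the Poisson term; terms with r - j < 0 vanish
        diff = sum((-1) ** j * comb(n, j) * poisson_term(float(lam), r - j)
                   for j in range(min(n, r) + 1))
        total += float(c) * (-1) ** n * diff
    return total


N = 60  # number of terms kept (assumption of this sketch)
MU_FIRST = [Fraction(factorial(n)) for n in range(N)]
MU_SECOND = [Fraction(1)] + [Fraction(3 * factorial(n), 2 ** (n + 1))
                             for n in range(1, N)]
MU_THIRD = [Fraction((-2) ** n * factorial(n), 5 ** n) for n in range(N)]

if __name__ == "__main__":
    print([float(c) for c in scaled_coefficients(MU_FIRST[:8], Fraction(1))])
    print([float(c) for c in scaled_coefficients(MU_SECOND[:8], Fraction(3, 4))])
    print([float(c) for c in scaled_coefficients(MU_THIRD[:5], Fraction(1))])
    print([type_b_partial_sum(MU_SECOND, Fraction(3, 4), r) for r in range(3)])
```

For the second function the partial sums should reproduce $p(0) = 1/2$, $p(1) = 1/3$ and $p(2) = 1/9$ closely, since its coefficients $L_n/n!$ decrease roughly like $2^{-n}$.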
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NOTES 


This section is devoted to brief research and expository articles on methodology
and other short items.


ON SMALL-SAMPLE ESTIMATION

By George W. Brown
Iowa State College

1. Summary. This paper discusses some of the concepts underlying small
sample estimation and reexamines, in particular, the current notions on “unbiased”
estimation. Alternatives to the usual unbiased property are examined
with respect to invariance under simultaneous one-to-one transformation of
parameter and estimate; one of these alternatives, closely related to the maximum
likelihood method, seems to be new. The property of being unbiased in
the likelihood sense is essentially equivalent to the statement that the estimate
is a maximum likelihood estimate based on some distribution derived by integration
from the original sampling distribution, by virtue of a “hereditary”
property of maximum likelihood estimation.

An exposition of maximum likelihood estimation is given in terms of optimum
pairwise selection with equal weights, providing a type of rationale for small
sample estimation by maximum likelihood.

2. Introduction. In large sample theory of estimation the problems are
generally formulated in terms of a random variable $x = (x_1, x_2, \cdots, x_n)$ and a
product distribution with, say, a density $g(x \mid \theta) = f(x_1 \mid \theta) f(x_2 \mid \theta) \cdots f(x_n \mid \theta)$,
where n is permitted to increase without limit. For small sample theory it is
sufficient to consider an arbitrary distribution, not necessarily of product form,
depending on a parameter $\theta$. For convenience we will assume a distribution
density of fixed form $g(x \mid \theta)$, where x is in Euclidean n-space and $\theta$ in Euclidean
k-space, k < n. Granting at the outset that a complete rationale for estimation
must be based on considerations like those of Wald [4, 1939] dealing with specified
risk functions, it is still a difficult process, in practice, to specify the risk functions
and solve the ensuing mathematics problems. It may still be to the point, then,
to consider general properties that estimates might be required to have in order
to be considered “acceptable”, or perhaps even “optimum”, over a class of
“acceptable” estimates.

In large-sample theory the situation is fairly simple. Consistent estimates
have the property that the estimate converges in probability to the true parameter
value. “Best” or “optimum” estimates are defined in terms of the order
of convergence, or asymptotic variance. All reasonable definitions of “optimum”
become asymptotically equivalent, since they all measure essentially the rate of
convergence, so that one might ask for ... deviation or least expected kth power,
without affecting the optimum estimate in general. Moreover, the consistency
property and the optimum property are in general invariant under simultaneous
one-to-one transformation of the parameter and its estimate, i.e., the square of
an asymptotically optimum estimate of $\sigma$ will be an asymptotically optimum
estimate of $\sigma^2$. Finally a general estimation method, the method of maximum
likelihood, leads to optimum estimates in large samples.

In small samples, on the other hand, the search for corresponding criteria has
led to the investigation of best “unbiased” estimates, and the like, where few,
if any, of the definitions discussed possess an invariance property under
simultaneous one-to-one transformation of the parameter and its estimate.


3. Unbiased estimation. To ensure, in small-sample estimation, that an
estimate bears some relation to the parameter it is estimating, it has become the
custom to require that an estimate be unbiased, which means that the expected
value of the estimate agrees with the parameter value. This condition was suggested
by the consistency property which is required in large-sample estimation.
It ensures, moreover, that the average of a large number of independent estimates
made on the same basis will provide a consistent estimate, in the large sample
sense. While this consistency property of the average may at times be convenient
in practical situations, the fact remains that the problem of estimation from
a number of such observations is a different estimation problem, the “best”
solution to which need not be the average of the “best” solutions of the original
problem corresponding to estimation of $\theta$ from a single observation on x, where
x has a density $g(x \mid \theta)$. More to the point, however, is the objection that an
unbiased estimate of a parameter does not in general transform into an unbiased
estimate when both estimate and parameter are subjected to the same one-to-one
transformation. Moreover, one can easily construct situations for which the
only acceptable unbiased estimates are clearly inferior from almost any point
of view, to estimates which are biased (Girshick, Mosteller and Savage, [1, 1946],
and Halmos [2, 1946]).

It may be of interest to consider a few reasonable alternatives to the lack of
bias requirement, which seem to accomplish as much as the conventional definition
and which, in addition, have an invariance under one-to-one transformation
of the parameter and estimate. To avoid confusion, let us attach the qualifying
prefix “mean” to the usual unbiased property, so that an estimate will be said
to be mean-unbiased if its expected value agrees with the parameter value.

Consider as one alternative the following property. An estimate of a
one-dimensional parameter $\theta$ will be said to be median-unbiased, if for fixed $\theta$, the
median of the distribution of the estimate is at the value $\theta$, i.e., the estimate
underestimates just as often as it overestimates. This requirement seems for
most purposes to accomplish as much as the mean-unbiased requirement and
has the additional property that it is invariant under one-to-one transformation,
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A different alternative requirement which is invariant under transformations 
is suggested by the definition of unbiased tests of significance (Neyman and
Pearson [3, 1936]). Let us say that an estimate is likelihood-unbiased if
$h(\theta \mid \theta') \le h(\theta \mid \theta)$, where the estimate $\hat{\theta}$ has probability density $h(\hat{\theta} \mid \theta)$. In other words, an
estimation method is likelihood-unbiased if estimates in the neighborhood of a
given parameter value $\theta$ would occur more frequently when the true value is
itself $\theta$ than when it differs from $\theta$. On intuitive grounds this seems to be an
acceptable kind of requirement, applicable to a very general class of estimation
problems. It is evident that the assumption of a density plays no important
role here; the situation is analogous to the maximum likelihood situation. The
property itself is invariant under simultaneous one-to-one transformations of
parameter and estimate for the same reason that maximum likelihood estimates
are invariant under such transformations; in fact one can readily see that the
likelihood-unbiased condition is equivalent to requiring that $\hat{\theta}$ have such a
distribution, as a function of $\theta$, that the maximum likelihood estimate of $\theta$
based on $\hat{\theta}$ will be actually equal to $\hat{\theta}$. The obvious implication of this fact is
that if a function $\phi(x)$ is given (possibly a sufficient statistic for $\theta$) then there is
an essentially unique likelihood-unbiased estimate $\hat{\theta}$ based on $\phi$, obtained by
finding the maximum likelihood estimate of $\theta$ in the distribution of $\phi$ as a function
of $\theta$.

As an example, consider the estimation of $\sigma^2$ from a sample of n observations
from a normal distribution. Let $S^2$ be the usual sum of squares, where $S^2/\sigma^2$
is distributed like $\chi^2$ on n - 1 degrees of freedom. Then the only
likelihood-unbiased estimate of $\sigma^2$ based on $S^2$ is $S^2/(n - 1)$. In this case $S^2/(n - 1)$ is
also mean-unbiased, a fact which is normally quoted as justification for the
division by n - 1. Curiously enough, it is customary to estimate $\sigma$ by
$\sqrt{S^2/(n - 1)}$, even though this is a biased estimate of $\sigma$, according to the usual
notion of “unbiased”, referred to here as “mean-unbiased”. On the other hand,
$\sqrt{S^2/(n - 1)}$ is a perfectly good likelihood-unbiased estimate of $\sigma$, by virtue
of the invariance under transformations. It might be pointed out, in passing,
that the estimate $S^2/(n - 1)$ does not have minimum mean square error about $\sigma^2$,
but that the optimum divisor for minimizing the mean square error about $\sigma^2$
is n + 1.
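The three numerical claims in this example can be checked with a short calculation. The Python sketch below relies only on standard facts about the $\chi^2$ distribution with $\nu = n - 1$ degrees of freedom (mean $\nu$, variance $2\nu$, and $E[\chi_\nu] = \sqrt{2}\,\Gamma((\nu+1)/2)/\Gamma(\nu/2)$); the function names, the search grid and the sample values are assumptions of the illustration.

```python
import math


def mse_of_divisor(n, d, sigma2=1.0):
    """Mean square error of S^2/d about sigma^2 when S^2/sigma^2 is chi-square(n-1)."""
    nu = n - 1
    return sigma2 ** 2 * ((nu / d - 1.0) ** 2 + 2.0 * nu / d ** 2)


def log_likelihood_sigma2(sigma2, s2, n):
    """Log density of S^2 as a function of sigma^2, up to terms free of sigma^2."""
    nu = n - 1
    return -0.5 * nu * math.log(sigma2) - s2 / (2.0 * sigma2)


def likelihood_maximizer(s2, n, lo=0.5, hi=5.0, steps=4500):
    """Grid search for the sigma^2 that maximizes the likelihood based on S^2."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda v: log_likelihood_sigma2(v, s2, n))


def mean_of_root_estimate(n, sigma=1.0):
    """Expected value of sqrt(S^2/(n-1)); below sigma, so mean-biased for sigma."""
    nu = n - 1
    return sigma * math.sqrt(2.0 / nu) * math.gamma((nu + 1) / 2.0) / math.gamma(nu / 2.0)


if __name__ == "__main__":
    n = 10
    print(min(range(1, 40), key=lambda d: mse_of_divisor(n, d)))  # best divisor
    print(likelihood_maximizer(18.0, n), 18.0 / (n - 1))          # both near 2
    print(mean_of_root_estimate(n))                               # slightly below 1
```

The grid search is a crude stand-in for solving the likelihood equation; setting the derivative of the log likelihood to zero gives $\sigma^2 = S^2/(n - 1)$ directly.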

The fact that a likelihood-unbiased estimate is the maximum likelihood estimate
based on the distribution of the estimate itself suggests further examination
of maximum likelihood estimates. If we define a simple estimate as one which
completely determines a probability distribution for $x$, then we have as a theorem
the following:

A simple maximum likelihood estimate $\hat\theta(x)$ is likelihood-unbiased. What this
means is essentially that maximum likelihood is "hereditary", i.e. if $\hat\theta(x)$
maximizes $g(x \mid \theta)$ in a space of $n$ dimensions, and $\hat\theta$ has a derived density
$h(\hat\theta \mid \theta)$ in a space of $k < n$ dimensions, then $\theta = \hat\theta$ maximizes
$h(\hat\theta \mid \theta)$. The proof follows readily from the fact that $h(\hat\theta \mid \theta)$ is
obtained by integration of $g(x \mid \theta)$ over all $x$ such that $\hat\theta(x) = \hat\theta$.
The example of estimating $\sigma^2$, quoted above, shows that the word "simple"
cannot be omitted from the statement above. For example, the simple estimate
in the parent distribution is the joint estimate $(\bar{x}, S^2/n)$ of $(\mu, \sigma^2)$, and in fact the
joint estimate is likelihood-unbiased. On the other hand, $S^2/n$ is not a simple
maximum likelihood estimate, and we observe that $S^2/n$ is not likelihood-unbiased.
$S^2/(n - 1)$ is a simple maximum likelihood estimate of $\sigma^2$ based on
the distribution of $S^2$ itself, so that $S^2/(n - 1)$ is, as a result, likelihood-unbiased.

One can exhibit situations in which the conventional mean-unbiased property
is very unnatural, while the likelihood-unbiased property may be quite natural.
Consider, for example, the case where $\sigma^2$ is to be estimated by use of $S^2$,
distributed as $\sigma^2\chi^2$ with $n - 1$ degrees of freedom, but subject to the condition
$\sigma^2 \ge \sigma_0^2$, where $\sigma_0^2$ is known in advance. Then the estimate
$\hat\sigma^2 = \max[S^2/(n - 1), \sigma_0^2]$ is certainly biased according to conventional
definitions, but is nevertheless likelihood-unbiased. To get a mean-unbiased estimate
when $\sigma^2$ is near to $\sigma_0^2$ is impossible except by admitting estimates less than
$\sigma_0^2$, which is clearly foolish if it is known that $\sigma^2 \ge \sigma_0^2$.

It may be of interest to include a brief discussion of maximum likelihood
estimation in terms of pairwise selection of alternatives, providing a sort of optimum
property for maximum likelihood estimation in small samples, in addition to the
likelihood-unbiased property. Consider a choice to be made between only two
alternative values of $\theta$, say $\theta_0$ and $\theta_1$, by dividing the sample space into two
regions $S_0$ and $S_1$, such that $\theta_0$ is accepted when $x$ falls in $S_0$ and $\theta_1$ is accepted
when $x$ falls in $S_1$. Then

$P_{\theta_0}(S_0) + P_{\theta_0}(S_1) = P_{\theta_1}(S_0) + P_{\theta_1}(S_1) = 1.$

$P_{\theta_1}(S_0)$ is the probability of making the error of accepting $\theta_0$ when $\theta = \theta_1$ and
$1 - P_{\theta_0}(S_0)$ is the probability of making the error of accepting $\theta_1$ when $\theta = \theta_0$.
If the two errors are weighted equally, it is evident that a "best" test will choose
$S_0$ so as to minimize $P_{\theta_1}(S_0) + 1 - P_{\theta_0}(S_0)$. It is well known that $S_0$ will
minimize the indicated quantity if $S_0$ consists of all points $x$ such that
$g(x \mid \theta_0) > g(x \mid \theta_1)$. Thus we may speak of the region $S_0$ defined by
$g(x \mid \theta_0) > g(x \mid \theta_1)$ as an optimum equal risk acceptance region for $\theta_0$ against
$\theta_1$. Now if we transfer our attention to the general estimation problem we see that
the maximum likelihood estimate $\hat\theta(x)$ is that value of $\theta$ which would be accepted
by the optimum equal risk acceptance procedure against all other $\theta$'s.
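The optimality of the equal risk acceptance region can be illustrated on a small discrete sample space. The sketch below is a toy check, not part of the original argument: the two probability vectors are invented for illustration, and the function names are hypothetical. It compares the region defined by the likelihood inequality with an exhaustive search over all subsets.

```python
from itertools import combinations


def total_error(region, g0, g1):
    """P_theta1(S0) + 1 - P_theta0(S0) for an acceptance region S0 of theta0."""
    return sum(g1[x] for x in region) + 1 - sum(g0[x] for x in region)


def likelihood_region(g0, g1):
    """Points where the likelihood under theta0 exceeds that under theta1."""
    return {x for x in range(len(g0)) if g0[x] > g1[x]}


def smallest_error_by_search(g0, g1):
    """Smallest total error over every possible acceptance region."""
    points = range(len(g0))
    return min(total_error(region, g0, g1)
               for size in range(len(g0) + 1)
               for region in combinations(points, size))


# Hypothetical distributions of x under theta0 and theta1 (illustration only).
g_theta0 = [0.5, 0.3, 0.1, 0.1]
g_theta1 = [0.1, 0.2, 0.3, 0.4]
region = likelihood_region(g_theta0, g_theta1)
```

With these numbers the likelihood region is the first two points, and its total error of 0.5 agrees with the smallest value found by the search.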
A NOTE ON REGRESSION ANALYSIS

By Abraham Wald

Columbia University

1. Introduction. In regression analysis a set of variables $y, x_1, \cdots, x_p$
is considered where $y$ is called the dependent variable and $x_1, \cdots, x_p$ are the
independent variables. Let $y_\alpha$ denote the $\alpha$th observation on $y$ and $x_{i\alpha}$ the
$\alpha$th observation on $x_i$, $(i = 1, \cdots, p;\ \alpha = 1, \cdots, N)$. The observations $x_{i\alpha}$
are treated as given constants, while the observations $y_1, \cdots, y_N$ are regarded
as chance variables. The following two assumptions are usually made concerning
the joint distribution of the variates $y_1, \cdots, y_N$:

(a) The variates $y_1, \cdots, y_N$ are normally and independently distributed with
a common unknown variance $\sigma^2$.

(b) The expected value of $y_\alpha$ is equal to $\beta_1 x_{1\alpha} + \cdots + \beta_p x_{p\alpha}$ where
$\beta_1, \cdots, \beta_p$ are unknown constants.

In some problems it seems reasonable to assume that the regression coefficients
$\beta_1, \cdots, \beta_p$ are not constants, but chance variables. This leads to a different
probability model for regression analysis and the object of this note is to discuss
certain aspects of this model. In what follows in this note we shall make the
following assumptions concerning the joint distribution of the chance variables
$y_1, \cdots, y_N, \beta_1, \cdots, \beta_p$.

Assumption 1. For given values of $\beta_1, \cdots, \beta_p$ the joint conditional
probability density function of $y_1, \cdots, y_N$ is given by

(1.1) $\frac{1}{(2\pi)^{N/2}\sigma^{N}} \exp\left[-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{N} (y_\alpha - \beta_1 x_{1\alpha} - \cdots - \beta_p x_{p\alpha})^{2}\right]$

Assumption 2. The regression coefficients $\beta_1, \cdots, \beta_p$ are independently
distributed.

Assumption 3. The regression coefficients $\beta_1, \cdots, \beta_r$, $(r \le p)$, are normally
distributed with zero means and a common variance $\sigma'^2$.

The purpose of this note is to derive confidence limits for the ratio $\sigma'^2/\sigma^2$. Such
confidence limits have been derived by the author [1] for analysis of variance
problems assuming that there are only main effects but no interactions. The
regression problem treated in the present note is much more general and
includes all the analysis of variance problems with or without interactions as
special cases.

It should be remarked that Assumptions 2 and 3 do not exclude the case where
$\beta_{r+1}, \cdots, \beta_p$ are constants.


2. Derivation of confidence limits for the ratio $\sigma'^2/\sigma^2$. Let $b_1, \cdots, b_p$ be the
sample estimates of $\beta_1, \cdots, \beta_p$ obtained by the method of least squares. We
shall denote the difference $b_i - \beta_i$ by $\epsilon_i$, $(i = 1, \cdots, p)$. It is known that for
given values of $\beta_1, \cdots, \beta_p$ the conditional joint distribution of $\epsilon_1, \cdots, \epsilon_p$ is
normal with zero means and variance-covariance matrix $\sigma^2 \| c_{ij} \|$ where

(2.1) $\| c_{ij} \| = \| a_{ij} \|^{-1}$

and

(2.2) $a_{ij} = \sum_{\alpha=1}^{N} x_{i\alpha} x_{j\alpha}$, $(i, j = 1, \cdots, p)$.

Since the conditional distribution of $\epsilon_1, \cdots, \epsilon_p$ does not depend on the values
of $\beta_1, \cdots, \beta_p$, the unconditional distribution of $\epsilon_1, \cdots, \epsilon_p$ is the same as the
conditional one, and the set of variates $(\beta_1, \cdots, \beta_p)$ is distributed independently
of the set $(\epsilon_1, \cdots, \epsilon_p)$. From this and Assumptions 2 and 3 it follows
that $b_1, \cdots, b_r$ have a joint normal distribution and that


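(2.3) $E b_i = 0$, $(i = 1, \cdots, r)$

and

(2.4) $E b_i b_j = c_{ij}\sigma^2 + \delta_{ij}\sigma'^2$, $(i, j = 1, \cdots, r)$

where $\delta_{ij} = 0$ for $i \ne j$ and $= 1$ for $i = j$.

We shall denote $\sigma'^2/\sigma^2$ by $\lambda$ and the elements of the inverse of
$\| c_{ij} + \delta_{ij}\lambda \|$ by $d_{ij}(\lambda)$:

(2.5) $\| d_{ij}(\lambda) \| = \| c_{ij} + \delta_{ij}\lambda \|^{-1}$, $(i, j = 1, \cdots, r)$.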

Then the quadratic form

(2.6) $Q(\lambda) = \frac{1}{\sigma^2} \sum_{j=1}^{r} \sum_{i=1}^{r} d_{ij}(\lambda)\, b_i b_j$

has the $\chi^2$ distribution with $r$ degrees of freedom.

It is known that for any given values of $\beta_1, \cdots, \beta_p, b_1, \cdots, b_p$ the quadratic
form

(2.7) $Q_a = \frac{1}{\sigma^2} \sum_{\alpha=1}^{N} (y_\alpha - b_1 x_{1\alpha} - \cdots - b_p x_{p\alpha})^2$

has the $\chi^2$ distribution with $N - p$ degrees of freedom provided that the rank
of the matrix $\| x_{i\alpha} \|$ is $p$. Hence $Q_a$ and $Q(\lambda)$ are independently distributed
and the ratio

(2.8) $F = \frac{N - p}{r} \cdot \frac{Q(\lambda)}{Q_a}$

has the $F$-distribution with $r$ and $N - p$ degrees of freedom.
Let $F_1$ and $F_2$ be two values chosen so that

(2.9) Prob. $\{F_1 \le F \le F_2\} = c$
where $c$ is a given positive constant less than 1. Then the set of all values $\lambda$
for which the inequality

(2.10) $F_1 \le \frac{N - p}{r} \cdot \frac{Q(\lambda)}{Q_a} \le F_2$

holds forms a confidence set for $\lambda$ with the confidence coefficient $c$.

We shall now show that $Q(\lambda)$ is a monotonic function of $\lambda$ and, therefore,
the confidence set determined by (2.10) is an interval. Let $\| g_{ij} \|$, $(i, j =
1, \cdots, r)$, be an orthogonal matrix and let

(2.11) $b_i^* = \sum_{j=1}^{r} g_{ij} b_j$.

It then follows from (2.3) and (2.4) that

(2.12) $E(b_i^*) = 0$, $(i = 1, \cdots, r)$

and

(2.13) $E(b_i^* b_j^*) = (c_{ij}^* + \delta_{ij}\lambda)\sigma^2$, $(i, j = 1, \cdots, r)$

where

(2.14) $c_{ij}^* = \sum_{l=1}^{r} \sum_{k=1}^{r} g_{ik}\, g_{jl}\, c_{kl}$.

Let

(2.15) $\| d_{ij}^*(\lambda) \| = \| c_{ij}^* + \delta_{ij}\lambda \|^{-1}$, $(i, j = 1, \cdots, r)$

and put

$Q^*(\lambda) = \frac{1}{\sigma^2} \sum_{j=1}^{r} \sum_{i=1}^{r} d_{ij}^*(\lambda)\, b_i^* b_j^*$.

It is easy to verify that $Q^*(\lambda)$ is identically equal to $Q(\lambda)$. Hence, to prove
the monotonicity of $Q(\lambda)$, it is sufficient to show that $Q^*(\lambda)$ is a monotonic
function of $\lambda$. Since no restrictions as to the choice of the orthogonal matrix $\| g_{ij} \|$
are made, we shall choose it so that the matrix $\| c_{ij}^* \|$ becomes diagonal, i.e.,
$c_{ij}^* = 0$ for $i \ne j$, $(i, j = 1, \cdots, r)$. Then

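(2.16) $d_{ij}^*(\lambda) = 0$ for $i \ne j$

and

(2.17) $d_{ii}^*(\lambda) = \frac{1}{c_{ii}^* + \lambda}$.

Hence

(2.18) $Q(\lambda) = Q^*(\lambda) = \frac{1}{\sigma^2} \sum_{i=1}^{r} \frac{b_i^{*2}}{c_{ii}^* + \lambda}$

is a monotonically decreasing function of $\lambda$. The confidence set determined by
(2.10) is, therefore, an interval.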
The upper end point of the confidence interval is the root in $\lambda$ of the equation

(2.19) $\frac{N - p}{r} \cdot \frac{Q(\lambda)}{Q_a} = F_1$

and the lower end point is the root in $\lambda$ of the equation

(2.20) $\frac{N - p}{r} \cdot \frac{Q(\lambda)}{Q_a} = F_2$.

If equation (2.20) has no root, the lower end point of the confidence interval
is put equal to zero.
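Because the ratio in (2.19) and (2.20) is monotonically decreasing in $\lambda$, both end points can be found by bisection. The following is a minimal sketch under stated assumptions, not the author's computing scheme: it supposes that the rotated estimates $b_i^*$ and the diagonal elements $c_{ii}^*$ of (2.18) are already available, that the residual sum of squares $\sigma^2 Q_a$ has been computed, and that $F_1$ and $F_2$ are positive values read from a table of the $F$-distribution. All function and argument names are hypothetical. Note that $\sigma^2$ cancels in $Q(\lambda)/Q_a$, so it is not needed.

```python
def f_ratio(lam, b_star, c_star, rss, n_obs, p):
    """(N - p)/r * Q(lam)/Q_a using the diagonal form (2.18); sigma^2 cancels."""
    r = len(b_star)
    q_times_sigma2 = sum(b * b / (c + lam) for b, c in zip(b_star, c_star))
    return (n_obs - p) / r * q_times_sigma2 / rss


def root_in_lambda(target, b_star, c_star, rss, n_obs, p):
    """Solve f_ratio(lam) = target (target > 0) for lam >= 0; None if no root."""
    def f(lam):
        return f_ratio(lam, b_star, c_star, rss, n_obs, p)

    if f(0.0) < target:
        return None  # the decreasing ratio is already below the target at 0
    lo, hi = 0.0, 1.0
    while f(hi) > target:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):  # plain bisection on a decreasing function
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lambda_interval(b_star, c_star, rss, n_obs, p, f1, f2):
    """(lower, upper) end points from (2.20) and (2.19)."""
    upper = root_in_lambda(f1, b_star, c_star, rss, n_obs, p)
    lower = root_in_lambda(f2, b_star, c_star, rss, n_obs, p)
    if lower is None:
        lower = 0.0  # convention stated in the text when (2.20) has no root
    # upper is None when even lam = 0 gives a ratio below F1; the text does not
    # discuss this case, so it is simply passed through here.
    return lower, upper
```

As a made-up check, with $b^* = (1, 2)$, $c_{11}^* = c_{22}^* = 0.5$, residual sum of squares 10, $N = 12$ and $p = 2$, the ratio equals $2.5/(0.5 + \lambda)$, so $F_1 = 0.5$ and $F_2 = 2.5$ give the interval from 0.5 to 4.5.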

REFERENCE

[1] A. Wald, "On the analysis of variance in case of multiple classifications with unequal
ON THE SHAPE OF THE ANGULAR CASE OF CAUCHY'S
DISTRIBUTION CURVES 

By Aurel Wintner 


The Johns Hopkins University 

1. Let $\xi$ be a linear random variable, that is, a random variable capable of
values $x$ represented by points of a line $-\infty < x < \infty$, and suppose, for
simplicity, that $\xi$ has a density of probability, $f(x)$. Then, subject to provisos of
convergence, the series

$F(x) = \sum_{n=-\infty}^{\infty} f(x + n)$

represents a periodic function, of period 1, having the following significance:
$F(x)$ is the density of probability of the angular random variable, say $\Xi$, which
is obtained if all the states

$\cdots, \xi - 2, \xi - 1, \xi, \xi + 1, \xi + 2, \cdots$

of the linear random variable are identified.

In other words, if a circle of unit circumference rolls from $-\infty$ to $\infty$ on the
$\xi$-line, then every point of the circumference collects the various densities of
probability attached to congruent points of the $\xi$-line

seats a point of the circumference. For a detailed study of the mapping f - m

According to Poisson's summation formula, the Fourier constants of the
periodic function $F(x)$ can be obtained by restricting $u$ in $g(u)$ to an equidistant
sequence of discrete values, where $g(u)$ denotes the Fourier transform of $f(x)$;
cf., e.g., [6], p. 78 or [9], pp. 477-478.
2. Consider, in particular, the case in which $f(x)$ is the density of a symmetric
distribution which is stable in Cauchy's sense. The determination of the totality
of these linear densities of probability is due to Lévy [6]. It was shown in [8]
that every such $f(x) = f(-x)$ is a decreasing function of $|x|$. As explained in
[8], p. 70, this fact makes superfluous one of the axioms occurring in Gauss'
postulational approach to "errors of observation."

The purpose of the present note is the deduction of the angular analogue of
the fact just quoted. The analogue states that, if $f(x)$ is symmetric and stable,
then the corresponding periodic $F(x)$ is decreasing for $0 \le x \le \frac{1}{2}$ (and so, for
reasons of symmetry, is increasing for $\frac{1}{2} \le x \le 1$). This is contained in the
italicized statement of §4 below.

In view of Poisson's rule, quoted above, the periodic densities in question can
be defined by certain Fourier series representing generalizations of elliptic
theta-series. From this point of view, not even the existence (i.e., the positivity) of
the periodic densities is obvious, if arbitrary values of the "precision constant"
(denoted below by $q$) are allowed. The difficulties involved are explained in §3.

3. If $q$ and $\lambda$ are positive constants the first of which is less than 1, then the
(even, periodic) function

(1) $\theta_\lambda(x; q) = 1 + 2 \sum_{n=1}^{\infty} q^{n^\lambda} \cos nx$,

where $q^{n^\lambda} > 0$, has derivatives of arbitrarily high order at every real $x$. It is
regular-analytic at every real $x$ if and only if $\lambda > 0$ is replaced by $\lambda \ge 1$, where
the sign of equality holds if and only if the analytic continuation (from the $x$-axis)
is not an entire function. In fact, it is known that a Fourier series
$\sum (a_n \cos nx + b_n \sin nx)$ is that of a function which is regular-analytic at every
real $x$, and has the period $2\pi$, if and only if $|a_n| + |b_n|$ is majorized by a constant
multiple of the $n$th power of a positive constant which is less than 1;
and that the latter constant can be chosen arbitrarily small if and only if the
analytic continuation does not lead to any singularity (at a $z \ne \infty$).

Since the function (1) tends to 1 uniformly in $x$ as $q \to +0$, if $\lambda$ is fixed, there
belongs to every $\lambda > 0$ a positive $q^* = q^*(\lambda)$ having the property that

(2) $\theta_\lambda(x; q) > 0$ for $0 \le x < 2\pi$

if $0 < q < q^*(\lambda)$. It is less obvious that, if $q$ is sufficiently small with reference
to $\lambda$, say if $0 < q < q^{**}(\lambda)$, then

(3) $\theta_\lambda(x; q)$ is decreasing for $0 \le x \le \pi$

(hence, increasing for $\pi \le x \le 2\pi$). The existence of such a $q^{**}(\lambda) \le 1$ for
every $\lambda > 0$ can be assured as follows:

If $s_n(x)$ denotes the $n$th partial sum of the Fourier series $\sum (\sin nx)/n$, then
$s_n(x)$ is positive for $0 < x < \pi$ (Gronwall, Jackson; for a short proof, cf. [4]).
Hence, a partial summation shows that the sum of a sine series, $\sum b_n \sin nx$,
must be positive for $0 < x < \pi$ if

$n b_n - (n + 1) b_{n+1} > 0$ and $n b_n \to 0$.

Since the first derivative of (1) (with respect to $x$) results by choosing
$b_n = -2n q^{n^\lambda}$, it follows that (3) must be true if

$n^2 q^{n^\lambda} - (n + 1)^2 q^{(n+1)^\lambda} > 0$

holds for $n = 1, 2, \cdots$. But the last inequality is readily seen to be satisfied
from $n = 1$ onward if, while $\lambda$ is fixed, $q$ tends to 0. This proves that $q^{**}(\lambda)$
exists for every $\lambda > 0$.


4. From these deductions alone, it is quite unexpected that (the best values of)
both $q^*(\lambda)$ and $q^{**}(\lambda)$ turn out to be independent of $\lambda$ when

(4) $0 < \lambda \le 2$,

i.e., that (1) satisfies both (2) and (3) for $0 < q < 1$, if (4) is assumed. This
fact is of statistical significance, since, on the one hand, it is precisely the
restriction (4) which is necessary and sufficient for the existence of Cauchy's
(symmetric) "stable" distributions (cf. [6], pp. 254-263) and, on the other hand,
the reduction (mod $2\pi$) of the densities of these linear distributions leads to
the functions (1) as angular densities (cf. [9], pp. 477-478); the numerical value
of $q$ $(< 1)$ being determined by the "precision" or "dispersion" of the resulting
angular distributions.

Under the necessary restriction (4), the linear analogue of $q^*(\lambda) = 1$ and of
$q^{**}(\lambda) = 1$ was proved in [6], pp. 258-263 and in [8], pp. 71-77, respectively.
It will remain undecided whether the restriction (4) is necessary in either of
the angular cases.
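Properties (2) and (3) can be inspected numerically for particular values of $q$ and $\lambda$. The sketch below is only a spot check on a finite grid, not a proof: the series (1) is truncated once its coefficients fall below a tolerance, and the function names, grid size and tolerance are arbitrary choices. For $\lambda = 1$ the series sums to the Poisson kernel $(1 - q^2)/(1 - 2q\cos x + q^2)$, which gives an independent check of the truncation.

```python
import math


def coefficients(q, lam, tol=1e-16, max_terms=1_000_000):
    """Coefficients q**(n**lam), n = 1, 2, ..., kept until they drop below tol."""
    coeffs = []
    for n in range(1, max_terms + 1):
        c = q ** (n ** lam)
        if c < tol:
            break
        coeffs.append(c)
    return coeffs


def theta(x, coeffs):
    """Truncated series (1): 1 + 2 * sum of coeffs[n-1] * cos(n x)."""
    return 1.0 + 2.0 * sum(c * math.cos(n * x)
                           for n, c in enumerate(coeffs, start=1))


def check_positive_and_decreasing(q, lam, points=200):
    """Check (2) and (3) on an equally spaced grid over [0, pi]."""
    coeffs = coefficients(q, lam)
    values = [theta(math.pi * i / points, coeffs) for i in range(points + 1)]
    positive = min(values) > 0.0
    decreasing = all(v1 > v2 for v1, v2 in zip(values, values[1:]))
    return positive, decreasing
```

For example, `check_positive_and_decreasing(0.5, lam)` can be evaluated for several `lam` in the range (4); a grid check of this kind cannot distinguish a strictly decreasing function from one with very small oscillations between grid points.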


5. Suppose that $\lambda$ has a fixed value in the range (4). Then there exists a
monotone function of $t$, say $\alpha_\lambda(t)$, for which

$\exp(-u^\lambda) = \int_0^\infty \exp(-u^2 t)\, d\alpha_\lambda(t)$

is an identity in $u$, where $0 < u < \infty$ (cf. [1], p. 769 where further references
will be found). Hence, a change of variables shows that

$q^{n^\lambda} = \int_0^\infty q^{n^2 t}\, d\alpha_\lambda(t\, |\log q|^{1 - 2/\lambda})$

is an identity in $q$ and $n$, where $0 < q < 1$ and $n = 0, 1, 2, \cdots$ (the integration
variable is $t$). Consequently from (1),

$\theta_\lambda(x; q) = \int_0^\infty \theta_2(x; q^t)\, d\alpha_\lambda(t\, |\log q|^{1 - 2/\lambda})$

where $0 < q < 1$ and $-\infty < x < \infty$. In fact, the legitimacy of the term-by-term
integration is obvious from $0 < q < 1$ and $d\alpha_\lambda \ge 0$ (even though the
integrals are improper).

6. Since $\alpha_\lambda$ is a non-decreasing function, it is clear from the last formula line
that both (2) and (3) will be proved for $0 < q < 1$ and for every $\lambda$ (satisfying (4)),
if it is ascertained that both (2) and (3) hold for $0 < q < 1$ when $\lambda = 2$. But
the case $\lambda = 2$ of (1) is an elliptic theta-function, for which both properties in
question (cf. the diagram in [3], p. 44) are known; a simple proof can be
concluded from what, in Hecke's terminology, is the Eulerian factorization of
$\theta_2(x; q)$, as follows:

According to Jacobi, the factorization of the case $\lambda = 2$ of (1) is

$\theta_2(x; q) = \prod_{n=1}^{\infty} (1 - q^{2n})(1 + 2q^{2n-1} \cos x + q^{4n-2})$

(cf. [7], pp. 64-65). Thus

$\theta_2(x; q) = c_q \prod_{n=1}^{\infty} P(x + \pi; q^{2n-1})$,

where

$c_q = \prod_{n=1}^{\infty} (1 - q^{2n})$

and

(5) $P(x; r) = 1 - 2r \cos x + r^2$, $(0 < r < 1)$,

hence

$P(x; r) > 0$ $(0 < r < 1)$.

Since $0 < q < 1$, this proves the case $\lambda = 2$ of (2). Furthermore, logarithmic
differentiation of the product representation of $\theta_2(x; q)$ gives

$\theta_2'(x; q) = \theta_2(x; q) \sum_{n=1}^{\infty} P'(x + \pi; q^{2n-1})/P(x + \pi; q^{2n-1})$,

where $f' = df/dx$; so that, by (5),

$P'(x + \pi; r) = -2r \sin x$.

Since $0 < q < 1$, the last three formula lines and the case $\lambda = 2$ of (2) imply that

$\theta_2'(x; q) < 0$ if $0 < x < \pi$,

as claimed by the case $\lambda = 2$ of (3).

This completes the proof of the italicized assertion.
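Jacobi's factorization quoted in §6 lends itself to a direct numerical comparison between the series and the product. The sketch below truncates both at fixed lengths chosen for illustration; the function names and the truncation points are assumptions, not part of the note.

```python
import math


def theta2_series(x, q, terms=200):
    """Case lambda = 2 of (1): 1 + 2 * sum of q**(n*n) * cos(n x)."""
    return 1.0 + 2.0 * sum(q ** (n * n) * math.cos(n * x)
                           for n in range(1, terms + 1))


def theta2_product(x, q, factors=2000):
    """Truncated product of (1 - q^(2n)) (1 + 2 q^(2n-1) cos x + q^(4n-2))."""
    value = 1.0
    cos_x = math.cos(x)
    for n in range(1, factors + 1):
        value *= (1.0 - q ** (2 * n)) * (
            1.0 + 2.0 * q ** (2 * n - 1) * cos_x + q ** (4 * n - 2))
    return value
```

Every factor of the product is positive for $0 < q < 1$, which is the positivity argument of §6 in computational form; the series alone does not make this visible.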
A NOTE ON THE FUNDAMENTAL IDENTITY OF SEQUENTIAL ANALYSIS

By G. E. Albert

U. S. Naval Ordnance Plant, Indianapolis

1. Introduction. Let $\{z_i\}$, $(i = 1, 2, 3, \cdots)$, be a sequence of real valued
random variables identically distributed according to the cumulative distribution
function $F(z)$. Define the sums $Z_N = z_1 + z_2 + \cdots + z_N$ for every positive
integer $N$. Choose two positive constants $a$ and $b$ and define the random variable
$n$ as the smallest integer $N$ for which one of the inequalities $Z_N \ge a$ or
$Z_N \le -b$ holds. The notations $P(u \mid F)$ and $E(u \mid F)$ will denote the probability
of $u$ and its expectation respectively assuming that $F$ is the distribution of the $z_i$.
Wald [1] has established the results contained in the following lemmas.

Lemma 1. If the variance of $F(z)$ is positive, $P(n < \infty \mid F) = 1$.
Lemma 2. If there exists a positive number $\delta$ such that $P(e^z < 1 - \delta \mid F) > 0$

and $P(e^z > 1 + \delta \mid F) > 0$ and if the moment generating function $\varphi(t) = E(e^{zt} \mid F)$


(1) $E\{e^{Z_n t} [\varphi(t)]^{-n} \mid F\} = 1$




.WJd'.».Al..l.o, 0) to bo Vild for .11 W°r. 

on a certain interval of the real axis. 
is valid and may be differentiated with respect to $t$ under the expectation sign any
number of times for all real values of $t$.

Proof. The notation $t_0$ will be used consistently to denote the $t$ value at which
$\varphi(t)$ has its minimum.

The proof of the theorem follows Wald's methods quite closely and certain
of the results given in [1] and [2] will be used here without discussion.

Consider first the validity of (1). For an arbitrary positive integer $N$ let $P_N$
be the probability $P(n \le N \mid F)$ and let $E_N(u \mid F)$ and $E_N^*(u \mid F)$ denote the
conditional expectations of $u$ subject to the respective conditions $n \le N$ and
$n > N$. Wald [1] has shown that for any finite real value of $t$

(2) $P_N E_N\{e^{Z_n t} [\varphi(t)]^{-n} \mid F\} + (1 - P_N) E_N^*\{e^{Z_N t} [\varphi(t)]^{-N} \mid F\} = 1$.

Since $\lim_{N \to \infty} P_N E_N\{[\varphi(t)]^{-n} \exp(Z_n t)\}$ is the left member of the identity (1),
it suffices to demonstrate that

(3) $\lim_{N \to \infty} (1 - P_N) E_N^*\{e^{Z_N t} [\varphi(t)]^{-N} \mid F\} = 0$

for all real values of $t$.

Since $1 - P_N$ tends to zero with increasing $N$ and the expected value
$E_N^*(e^{Z_N t} \mid F)$ involved in (3) is bounded independently of $N$ for any fixed $t$,
the only source of difficulty in proving (3) lies in the fact that $\varphi(t)$ may be less
than unity on an interval of the real axis. That difficulty is easily avoided by the
following device. Define the function

(4) $G(x) = [\varphi(t_0)]^{-1} \int_{-\infty}^{x} e^{t_0 z}\, dF(z)$.

Obviously $G(x)$ is a distribution function whose moment generating function
$\psi(t)$ exists for all real $t$. Its mean is zero and its variance is positive as will be
seen from the equations $E(x \mid G) = \varphi'(t_0)/\varphi(t_0)$ and $E(x^2 \mid G) = \varphi''(t_0)/\varphi(t_0)$. It
follows that $\psi(t)$ is never less than unity for real values of $t$.

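Let $\Omega$ denote the space of all $z_1, \cdots, z_N$ and let $\Omega(n > N)$ be that subset of $\Omega$
on which $n > N$. One has

$(1 - P_N) E_N^*\{e^{Z_N t} [\varphi(t)]^{-N} \mid F\} = [\varphi(t)]^{-N} \int_{\Omega(n > N)} e^{Z_N t}\, dF(z_1) \cdots dF(z_N)$

$= [\psi(s)]^{-N} \int_{\Omega(n > N)} e^{Z_N s}\, dG(z_1) \cdots dG(z_N) = (1 - Q_N) E_N^*\{e^{Z_N s} [\psi(s)]^{-N} \mid G\}$

where $s = t - t_0$ and $Q_N = P(n \le N \mid G)$. By Lemma 1, $1 - Q_N$ tends to
zero as $N$ is increased. Thus, since $\psi(s) \ge 1$ for all real $t$ and the expected value
$E_N^*\{e^{Z_N s} \mid G\}$ is bounded independently of $N$ for a fixed $t$, the equation (3) holds
for all real $t$.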
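The identity (1) can be checked exactly in a simple special case, including values of $t$ for which $\varphi(t) < 1$. The sketch below is an illustration only and rests on assumptions not taken from the note: the $z_i$ are independent steps equal to $+1$ with probability $p$ and $-1$ with probability $1 - p$, and $a$, $b$ are positive integers, so that $Z_n$ stops exactly on a boundary. The function name and the cutoff on the number of steps are hypothetical.

```python
import math


def identity_lhs(p, a, b, t, max_steps=2000):
    """Partial sum of E{exp(Z_n t) phi(t)**(-n)} over n <= max_steps for a +-1 walk."""
    phi = p * math.exp(t) + (1 - p) * math.exp(-t)
    # weights[z] holds E[ 1{n > N, Z_N = z} * phi**(-N) ] for interior points z.
    weights = {0: 1.0}
    total = 0.0
    for _ in range(max_steps):
        new_weights = {}
        for z, w in weights.items():
            for step, prob in ((1, p), (-1, 1 - p)):
                z_next = z + step
                w_next = w * prob / phi
                if z_next >= a or z_next <= -b:
                    # The walk stops here; add exp(Z_n t) * phi**(-n) * probability.
                    total += w_next * math.exp(z_next * t)
                else:
                    new_weights[z_next] = new_weights.get(z_next, 0.0) + w_next
        weights = new_weights
    return total
```

With $p = 0.3$ the minimum of $\varphi(t)$ is $2\sqrt{0.21} < 1$, attained at $t_0 = \frac{1}{2}\log(7/3)$, so evaluating `identity_lhs(0.3, 3, 3, t0)` exercises exactly the case that the device (4) is designed to handle. The weight left on paths that have not yet stopped plays the role of the term in (3).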
The differentiability clause of the theorem requires the following modification
of a very powerful theorem due to Charles Stein [3].

Lemma 3. Under the conditions of Lemma 2, if the minimum $\varphi(t_0)$ of $\varphi(t)$ is
less than unity, there exists a positive number $t_1$ such that

(5) $E\{\exp[n t_1 - n \log\varphi(t_0)] \mid F\} < \infty$.

Proof. If $G$ is the distribution of the $z_i$, by Stein's theorem there exists a
positive number $t_1$ such that $E(e^{n t_1} \mid G)$ is finite. Let $\Omega(n = N)$ denote the
subset of $\Omega$ on which $n = N$. Then

$P(n = N \mid G) = \int_{\Omega(n = N)} dG(z_1) \cdots dG(z_N)$

$= [\varphi(t_0)]^{-N} \int_{\Omega(n = N)} e^{t_0 Z_N}\, dF(z_1) \cdots dF(z_N)$

$\ge P(n = N \mid F) \exp[\min(a t_0, -b t_0) - N \log\varphi(t_0)]$.

It follows that

$E\{\exp[n t_1 - n \log\varphi(t_0)] \mid F\} \le E\{e^{n t_1} \mid G\} \exp[-\min(a t_0, -b t_0)]$

and the lemma is proved.

To continue with the theorem, Wald's proof [2] suffices for the case in which
$\varphi(t_0) = 1$. Attention will be given only to the case $\varphi(t_0) < 1$. As pointed out in
section 2 of [2], the differentiability clause of the theorem will be established if
it can be shown that for any finite interval $I$ of the real axis and any pair of
integers $r_1$ and $r_2$ there exists a function $D_{r_1 r_2}(Z_n, n)$ such that for all $t$ in $I$
one has

(6) $D_{r_1 r_2}(Z_n, n) \ge |n^{r_1} Z_n^{r_2} e^{Z_n t} [\varphi(t)]^{-n}|$

and

(7) $E\{D_{r_1 r_2}(Z_n, n) \mid F\} < \infty$.

On referring to Wald's proof and using the inequality $-\log\varphi(t) \le -\log\varphi(t_0)$ for
all $t$ in $I$, it is seen that there exists a constant $C$ and a positive number $t_2$ such
that the function

satisfies (6) for all $t$ in $I$. To establish (7) use the inequalities (2.4) and (2.6)
in Wald [2] to obtain

P(A,,.(^„,n)|Pl 

= Cj:P(n = N\F)mk)r‘'Er.^s I 

J /-1 

g (7 { e"'* Z(< 2 ) + Z( - fa) ) P ( exp [n log n - n log p(fa)] | P 1 . 


( 8 ) 
That (7) is indeed satisfied now follows from (5) and the finiteness of the function
$l(t)$, since for a large enough integer $M$ one has

$\sum_{N=M}^{\infty} P(n = N \mid F) \exp[r_1 \log N - N \log\varphi(t_0)]$

$\le \sum_{N=M}^{\infty} P(n = N \mid F) \exp[N t_1 - N \log\varphi(t_0)] < \infty$.

Thus the expected value on the extreme right in (8) is finite. This completes
the proof of the theorem.
A SIGNIFICANCE TEST AND ESTIMATION IN THE CASE OF
EXPONENTIAL REGRESSION

By D. S. Villars

United States Rubber Company, Passaic, N. J.


1. Introduction. The principal problem under consideration in this note
may be described as follows. Consider a variate, $z$, whose distribution for a
given value of a fixed variate, $t$, is:

(1.1) $\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(z - \zeta)^2/2\sigma^2}$, $\zeta = a - b e^{-kt}$,

where $a$, $b$, and $k$ are real-valued parameters. The regression of $z$ on $t$ is
exponential, for it follows from (1.1) that the expected value of $z$, given $t$, is:

(1.2) $E(z \mid t) = a - b e^{-kt}$.

On the basis of a random sample $O_N(z_1, t_1; z_2, t_2; \cdots; z_N, t_N)$ it is desired to
test whether $k = 0$ or $\infty$. The problem of "fitting" a curve, $z = a - b e^{-kt}$,
to the sample (i.e., of estimating $a$, $b$, and $k$ from the sample) will also be treated.
As an illustration of how the statistical problems described above arise in

* Present address, Jersey City Junior College, Jersey City, N. J 
practice, let us consider a typical situation in industrial chemistry. Let the
quantity, $z$, be a property of a latex and let the quantity, $t$, be time. Suppose,
furthermore, that measurement of $t$ is without error but that measurement of $z$
is subject to error; let it be assumed that the observed value in a measurement of
$z$ has a normal (Gaussian) distribution about the "true value" $\zeta$.
On the basis of $N$ independent measurements, $z_1, z_2, \cdots, z_N$, of $z$ at times
$t_1, t_2, \cdots, t_N$, respectively, the experimenter may wish to test the hypothesis
that $k = 0$ or $\infty$. If this hypothesis is true the suspected exponential relation
between $z$ and $t$ does not hold; in this case $E(z)$ is a constant ($a - b$, or $a$) and
estimation of the constant from the data is quite straightforward. If the data
conflict with the hypothesis that $k = 0$ or $\infty$, the experimenter may wish to
estimate the parameters, $a$, $b$, and $k$ (i.e., "fit" the curve, $z = a - b e^{-kt}$, to the
data).

The problems considered in this note will be treated only for the case where $N$
is an even integer $(\ge 6)$ and the times $t_1, t_2, \cdots, t_N$ at which measurements of
$z$ are made are such that

(1.3) $t_{2\alpha} - t_{2\alpha - 1} = \Delta$, a constant, $(\alpha = 1, 2, \cdots, n = N/2)$.

The odd time intervals, $t_3 - t_2$, $t_5 - t_4$, etc., do not have to be equal.


2. Test of the hypothesis that $k = 0$ or $\infty$. The space, say $R$, of admissible
values of the parameters in (1.1) is: $\sigma^2 > 0$, $-\infty < a, b, k < +\infty$. Under the
null hypothesis the admissible values of the parameters lie in a subspace of $R$,
say $\omega$, specified as follows: $\sigma^2 > 0$, $-\infty < a, b < +\infty$, $k = 0$, or $\infty$.

Let $y_j = z_{2j}$ and $x_j = z_{2j-1}$, $(j = 1, \cdots, n = N/2)$. From (1.1) and (1.3)
it follows that the $n$ pairs $x_j, y_j$ are normally and independently distributed with
common variance, $\sigma^2$, that $x_j$ and $y_j$ are independent $(j = 1, 2, \cdots, n)$, and
that

(2.1) $\nu_j = h + m\mu_j$

where $\nu_j = E(y_j)$, $\mu_j = E(x_j)$, $h = a(1 - e^{-k\Delta})$ and $m = e^{-k\Delta}$. The space,
$R'$, of admissible values of the parameters in the joint distribution of $x_j, y_j$,
$(j = 1, \cdots, n)$ is: $\sigma^2 > 0$, $\nu_j = h + m\mu_j$, $-\infty < h < +\infty$, $-\infty < \mu_j, \nu_j <
+\infty$; $0 \le m < \infty$. The subspace of $R'$, say $\omega'$, associated with the null
hypothesis is: $\sigma^2 > 0$, $\nu_j = \mu_j = c$, where $c = a - b$ or $a$ according as $k = 0$ or $\infty$.
In $R'$, the expected values of $x$ and $y$ lie on a line, in $\omega'$ they lie in a single point.
It is clear that by transforming the original sample $O_N(z_1, t_1; \cdots; z_N, t_N)$ to a
sample $O_n(x_1, y_1; \cdots; x_n, y_n)$ we have reduced the original problem to the
familiar problem of linear regression in which there is "error in both variates".

The slope of the "line of best fit" to the sample points $(x_1, y_1; \cdots; x_n, y_n)$
is [1]:

(2.2) $\hat{m} = [S_{yy} - S_{xx} + \sqrt{(S_{yy} - S_{xx})^2 + 4S_{xy}^2}\,]/2S_{xy}$

where

$S_{xx} = \sum_{1}^{n} (x_j - \bar{x})^2$

$S_{xy} = \sum_{1}^{n} (x_j - \bar{x})(y_j - \bar{y})$

$S_{yy} = \sum_{1}^{n} (y_j - \bar{y})^2$

$\bar{x} = \sum_{1}^{n} x_j/n$

$\bar{y} = \sum_{1}^{n} y_j/n$

($\hat{m}$ is an estimate of $m$ in (2.1)). Since $m = e^{-k\Delta}$ (where $k$ and $\Delta$ are real), it is
intuitively clear that when $\hat{m}$ is non-positive the sample $O_n$ does not conflict
with the null hypothesis. The null hypothesis can be tested by means of the
statistic [2, 144]

(2.3) $F' = \frac{S_{xx} + 2\hat{m}S_{xy} + \hat{m}^2 S_{yy}}{\hat{m}^2 S_{xx} - 2\hat{m}S_{xy} + S_{yy}}$.

The null hypothesis is rejected if $\hat{m}$ is positive and $F'$ is large. Percentage points
of the distribution of $F'$ are given in [2, 140] for $n$ = 3 (1) 15 (5) 30, 40, 60, 120
and for significance levels, 0.001, .01, .05, .10, and .20. These significance
levels, however, were computed for use in cases where the sign of $\hat{m}$ was
irrelevant. It happens that to test the null hypothesis under consideration in this
problem at a significance level $\alpha$ we should use a critical value of $F'$ (given in
[2]) corresponding to a significance level $2\alpha$. The reason for this is that when
the null hypothesis is true the quantities $\hat{m}$ and $F'$ are independent and the
probability that $\hat{m}$ is positive is $\frac{1}{2}$; thus the chance of rejecting the null
hypothesis is $\frac{1}{2}(2\alpha) = \alpha$.

3. Estimation of $a$, $b$, and $k$. If the data do not support the hypothesis that
$k = 0$ or $\infty$, the experimenter may wish to estimate $a$, $b$, and $k$. Several
alternative methods of estimating these parameters will now be considered.

(1) Estimate $a$, $b$, and $k$ from $O_N$ by the method of least squares; i.e., solve
the simultaneous equations $\partial S/\partial a = 0$, $\partial S/\partial b = 0$, and $\partial S/\partial k = 0$ for $a$, $b$,
and $k$, where

(3.1) $S = \sum_{i=1}^{N} (z_i - a + b e^{-k t_i})^2$.

The value of $k$ obtained by this method of estimation will not in general be the
same as that computable from $\hat{m}$ in (2.2) and used for the significance testing.

(2) Estimate $k$ by means of (2.2) and the relation $m = e^{-k\Delta}$; then substitute
this estimate into $S$ of (3.1) and estimate $a$ and $b$ by means of least squares.
(3) Estimate $k$ as in (2) and choose, as an estimate of $a$, the intercept of the
"line of best fit" for $O_n$. Then substitute these estimates of $a$ and $k$ into (3.1)
and estimate $b$ by means of least squares. In this case the estimate of $b$ comes
out to be:

(3.2) $\hat{b} = \sum_{i=1}^{N} (\hat{a} - z_i) e^{-\hat{k} t_i} \Big/ \sum_{i=1}^{N} e^{-2\hat{k} t_i}$

where $\hat{a}$ and $\hat{k}$ are the estimates of $a$ and $k$.

If the values, $t_1, t_2, \cdots, t_N$ are such that $t_{i+1} - t_i = \Delta$, $(i = 1, 2, \cdots, N - 1)$,
the following estimation procedure might be used.

(4) Let

$y_i = z_{i+1}$, $x_i = z_i$, $(i = 1, 2, \cdots, N - 1)$,

and treat the $(N - 1)$ pairs of values $(x_1, y_1; \cdots; x_{N-1}, y_{N-1})$ as a sample of
size $(N - 1)$. Using this sample, estimate $k$, $a$, and $b$ in a manner similar to
that in (2) or (3). It should be noted that this sample is not a random sample
owing to the dependence among the $(N - 1)$ elements.

The procedure in alternative (1) is very laborious and time-consuming. The
procedure in (2) and (3) can be carried out quickly and easily. In (1) the
method of least squares yields the same results as would be obtained from
application of the method of maximum likelihood. Examples of estimation by
procedures (3) and (4) are given in the next section.
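Procedure (3) can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the writer's own computing scheme: the line of best fit is taken to pass through $(\bar{x}, \bar{y})$, so that $\hat{h} = \bar{y} - \hat{m}\bar{x}$, and $a$ is then obtained from $h = a(1 - m)$ in (2.1); the slope must come out between 0 and 1 for the logarithm and the division to make sense. The function name and the returned dictionary are hypothetical.

```python
import math


def fit_procedure_3(times, z, delta):
    """Estimate m, k, a, b for z = a - b*exp(-k t) from pairs (z_1, z_2), (z_3, z_4), ..."""
    x = z[0::2]  # first, third, ... observations
    y = z[1::2]  # second, fourth, ... observations, each taken delta later
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xj - x_bar) ** 2 for xj in x)
    s_xy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y))
    s_yy = sum((yj - y_bar) ** 2 for yj in y)
    # Slope of the line of best fit, equation (2.2).
    m_hat = (s_yy - s_xx + math.sqrt((s_yy - s_xx) ** 2 + 4 * s_xy ** 2)) / (2 * s_xy)
    k_hat = -math.log(m_hat) / delta        # from m = exp(-k * delta)
    h_hat = y_bar - m_hat * x_bar           # assumption: line passes through the means
    a_hat = h_hat / (1 - m_hat)             # from h = a * (1 - m)
    # Least squares estimate of b for fixed a and k, equation (3.2).
    numerator = sum((a_hat - zi) * math.exp(-k_hat * ti) for zi, ti in zip(z, times))
    denominator = sum(math.exp(-2 * k_hat * ti) for ti in times)
    return {"m": m_hat, "k": k_hat, "a": a_hat, "b": numerator / denominator}
```

On error-free values generated from a curve of the form $z = a - b e^{-kt}$ at equally spaced times, the pairs lie exactly on the line (2.1) and the function returns the generating parameters up to rounding, which is a convenient sanity check before applying it to observed data.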


4. Example. The accompanying table lists experimentally observed values
of a property of a latex obtained at biweekly intervals. Using the first, third,
etc., quantities as $x_j$ and the remaining ones as $y_j$, the sums of squares and
products of deviations are found to be:

$S_{xx} = .036610$    $\bar{x} = .9195$

$S_{xy} = .026645$

$S_{yy} = .023414$    $\bar{y} = .9365$.


Substituting these values in equation (2.2) and computing the other constants
from equation (2.1) we get: $\hat{m} = 0.791596$, $a = 1.0009$, and $k = 0.1168$. The
$F'$ ratio (2.3) is 17.03. Entering Table I of [2], we find that for eight point pairs
a value of $F' = 16.6$ may be expected only one time in one hundred. On
excluding the possibility of negative values of $\hat{m}$, this corresponds to the 0.5%
significance level. The exponential relationship is thus concluded to be highly
significant.

Evaluation of $b$ by equation (3.2), method 3, gives 0.2560, if all 16 values are
used. The equation calculated from the data is thus:

(4.1) $z = 1.0009 - 0.2560\, e^{-0.1168 t}$
The alternative procedure, method 4, would be to use all the $z_i$ points for the
estimation of $a$ and $k$. This leads to the following values of the computation
quantities:

$S_{xx} = 0.052374$;  $\bar{x} = 0.9223$

$S_{xy} = .036924$

$S_{yy} = .036436$;  $\bar{y} = .9381$.

Note that the difference $S_{xx} - S_{yy}$ used in the formula for $\hat{m}$ cancels out all
intervening squares between the first and last:

$\sum x_i^2 - \sum y_i^2 = x_1^2 - y_{15}^2$.

TABLE I

weeks   $z_i$   |  weeks   $z_i$   |  weeks   $z_i$   |  weeks   $z_i$
  1     .776    |    9             |   17     .942    |   25     .966
  3     .862    |   11             |   19     .938    |   27     .993
  5     .860    |   13     .930    |   21     .979    |   29     .985
  7     .869    |   15     .948    |   23     .975    |   31    1.013


However, the data excluded thereby are in effect included in the new $S_{xy}$.

The final values obtained by the fourth procedure are: $\hat{m} = 0.796690$, $a =
1.0000$, and $k = 0.1137$. The writer does not know whether the peculiar
transference of data from $S_{xx} - S_{yy}$ to $S_{xy}$ characteristic of procedure 4 improves the
accuracy of the fit or hurts it. It is his personal preference to use procedure 3.

5. Acknowledgement. The writer wishes to acknowledge with thanks his
gratitude to Drs. T. W. Anderson, Jr. and David F. Votaw, Jr. for many
suggestions and discussions concerning this problem and for much help in clarifying
the presentation of the concepts.
ON THE POWER EFFICIENCY OF A t-TEST FORMED
BY PAIRING SAMPLE VALUES

By John E. Walsh

Princeton University

1. Introduction. Consider two equal sized samples, one from a normal
population with mean $\mu$ and the other from a normal population with mean $\nu$. Let
$x_1, \cdots, x_n$ be the sample values from the population with mean $\mu$ and $y_1, \cdots, y_n$
the values from the population with mean $\nu$. If the two populations have the
same variance and the two samples are independent, the most powerful tests
for comparing $\mu$ and $\nu$ using these samples (one-sided and symmetrical
two-sided) are based on the statistic

$t_1 = \frac{[\bar{x} - \bar{y} - (\mu - \nu)]\sqrt{n(n - 1)}}{\sqrt{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}}$

which has a Student $t$-distribution with $2n - 2$ degrees of freedom. Tests based
on $t_1$ also have the desirable property of being invariant under permutation of
the data in each sample.
Sometimes, however, it is useful to combine the sample values in the form 

Zi = {xt — y^, ii = \^ ... ^ n). 

Examples ; 

(a) . When the samples are independent but it is not known that the two popu- 
lations have the same variance (Behrens-Fisher problem) . 

(b) . When there may be correlation between r, and y, , (t = 1, , n), 

this correlation being the same for each value of i (i.e. x, is independent of yj 
a i j while each pair r, , j/i , (i = 1, • • • , n), has the same normal bivariate 
distribution) . 

In both (a) and (b) the are independently normally distributed with the 
same variance and mean n — v. 

The Student i-test for comparing n and v using the z, is based on the statistic 
^ _ [S — in — v)]Vnin - 1) ^ [x — y - in - v)Wnin - 1) 

S (2< - 2)* ~ V' ~ ~ 

which has a Student ^distribution with n — 1 degrees of freedom. These tests 
are not invariant under permutation of the data m each sample. 

If it is true that all the sample values arc independently distributed with the 
same variance c*, efficiency will be lost by using the test based on h instead of 
the most powerful test based on U . The purpose of this note is to determine the 
power efficiency of the tests based on k as compared with the corresponding 
tests based on U for this case. 
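
The two statistics compared in this note can be computed side by side. The following is a minimal sketch, assuming the pooled statistic (here called t_2) and the paired statistic (here called t_1) take their usual equal-sample-size forms; the function names and the sample data are illustrative and not from the note.

```python
import math


def pooled_t(x, y, delta=0.0):
    """Two-sample statistic with 2n - 2 degrees of freedom (equal sizes n).

    delta is the hypothesized value of (mean of x) - (mean of y).
    """
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    ss = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    return (xbar - ybar - delta) * math.sqrt(n * (n - 1)) / math.sqrt(ss)


def paired_t(x, y, delta=0.0):
    """Statistic formed from the differences z_i = x_i - y_i (n - 1 d.f.)."""
    n = len(x)
    z = [a - b for a, b in zip(x, y)]
    zbar = sum(z) / n
    ss = sum((v - zbar) ** 2 for v in z)
    return (zbar - delta) * math.sqrt(n * (n - 1)) / math.sqrt(ss)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]  # made-up data for illustration
    y = [0.0, 2.0, 1.0, 3.0]
    print("pooled:", pooled_t(x, y), "paired:", paired_t(x, y))
```

Reordering the values within `y` leaves `pooled_t` unchanged but generally alters `paired_t`, which is the lack of invariance remarked on above.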



TABLE I 

Power Function Values for the t_1 and t_2 Tests 

                  Approx.              Approx. Values of Power Function 
Test     n      Efficiency     α      δ = 1/2    δ = 1    δ = 3/2    δ = 2 

t_1      6        87%         .05      .270      .674      .933      .994 
t_2      5.2                  .05      .275      .672      .932      .994 
t_1      6        82.5%       .025     .159      .480      .822      .970 
t_2                           .025     .160 
t_1      8        90%         .05      .355      .812      .985 
t_2      7.2                  .05      .354      .813      .985 
t_1      8        86.5%       .025     .220      .674      .952      .998 
t_2      6.9                  .025     .225      .675      .951      .998 
t_1      8        82%         .01      .112                .813 
t_2      6.55                 .01      .112                .842 
t_1     10        92%         .05      .425      .898      .997 
t_2                           .05      .426      .897      .997 
t_1     10        90%         .025               .812      .988 
t_2      9                    .025     .29?      .813      .988 
t_1     10        86.5%       .01      .159      .620      .950      .999 
t_2      8.66                 .01      .159      .627      .950      .999 
t_1     15        95.5%       .05      .579 
t_2     14.3                  .05      .579 
t_1     15        93%         .025     .437      .950     1.000 
t_2                           .025     .437      .949     1.000 
t_1     15        90%         .01      .278                .998 
t_2                           .01      .278                .998 
t_1     25        98%         .05      .784      .999 
t_2     24.6                  .05      .784      .999 
t_1     25        96%         .025               .998 
t_2     24                    .025               .998 
t_1     25        94.5%       .01      .514      .992 
t_2     23.7                  .01      .514      .992 

Consideration is limited to one-sided tests, which is not a serious limitation 
since any two-sided test can be considered as a combination of two one-sided 
tests. Table II contains approximate power efficiencies of one-sided tests for 
$n \geq 4$ at the significance levels $\alpha = .05, .025, .01$. 

It is found that the efficiency of the $t_1$ test increases with the sample size but 
is high even for small size samples. 

2. Outline of computations. The method of obtaining power efficiencies 
used here will be that outlined in [1]. Essentially this consists in computing the 
power function for the test based on $t_1$ and then adjusting the sample size for 
the corresponding test based on $t_2$ until its power function is approximately the 
same as for the $t_1$ test. The ratio of the sample size (perhaps fractional) of the 
adjusted $t_2$ test to that of the $t_1$ test is called the power efficiency of the $t_1$ test. 
Intuitively this efficiency measures the fraction of the total available information 
which is being used when the $t_1$ test is applied (since the $t_2$ test is most powerful). 
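
The definition of power efficiency as a ratio of sample sizes can be checked against the tabulated figures. The sketch below is illustrative: the rows pair a sample size n for the paired test with the adjusted (possibly fractional) size for the pooled test and the quoted efficiency, as read from Tables I and II of this note; the names `ROWS` and `power_efficiency` are hypothetical.

```python
# (n for the paired test, adjusted n for the pooled test, quoted efficiency),
# as read from Tables I and II; the scanned tables are partly illegible, so
# only rows that could be read with reasonable confidence are used here.
ROWS = [
    (6, 5.2, 0.87),
    (8, 7.2, 0.90),
    (10, 9.0, 0.90),
    (25, 24.0, 0.96),
    (25, 23.7, 0.945),
]


def power_efficiency(n_paired, n_pooled_adjusted):
    """Ratio of the adjusted pooled-test size to the paired-test size."""
    return n_pooled_adjusted / n_paired


if __name__ == "__main__":
    for n1, n2, quoted in ROWS:
        print(f"n = {n1:2d}: ratio = {power_efficiency(n1, n2):.3f}, quoted = {quoted}")
```

The ratios agree with the quoted efficiencies to within the half-percent rounding used in the tables.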


TABLE II 

Approximate Power Efficiencies for Given n and α 

  α \ n      4       5       6       7       8       9      10      15      25      ∞ 

  .05      82.5%   86%     87%     88.5%   90%     91%     92%     95.5%   98%     100% 
  .025     77%*    80%*    82.5%   84.5%   86.5%   88.5%   90%     93%     96%     100% 
  .01      73%     75.5%   78%     80%     82%     84%     86.5%   90%     94.5%   100% 

* These values were obtained by comparison with the corresponding values for 
α = .05 and .01. 


It is easily seen from symmetry that a one-sided $t_1$ test of $\mu < \nu$ has the same 
power efficiency as the corresponding one-sided $t_1$ test of $\mu > \nu$. Thus it is 
sufficient to consider the one-sided tests of $\mu > \nu$. 

The power function is found as a function of the parameter $\delta$, where 

$$\delta = \frac{\mu - \nu}{\sigma\sqrt{2}}.$$ 


Most of the approximate power efficiencies were determined by using the 
normal approximation given in [2] to compute the [...] approximation was used 
for fractional values of n. Table I contains the results of those computations 
for one-sided tests at $\alpha = .05, .025, .01$. Exact values of the power function 
[...] can be found from the tables in [3]. Comparison of the power function 
values [...] values shows that, [...]tion of power efficiencies, so that little error 
in power efficiencies would be expected if the approximation were used for 
$n = 6$, $\alpha = .01$ or $n = 4$, $\alpha = .05$, the efficiencies given in Table II for 
$n = 4$, $\alpha = .05$ and $n = 4, 6$, $\alpha = .01$ were obtained from the exact values by 
graphical interpolation and cross-interpolation. 
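
For orientation, a normal approximation to the power of a one-sided t-test can be sketched as follows. This is not claimed to be the approximation of [2]; it uses a commonly quoted normal approximation to the non-central t-distribution, and it assumes that the paired test on n differences has n - 1 degrees of freedom and non-centrality δ√n with δ = (μ - ν)/(σ√2). The critical value must be supplied by the user (2.015 is the upper 5% point of Student's t with 5 degrees of freedom); the name `approx_power` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(n, delta, t_crit, df):
    """Approximate power of a one-sided t-test.

    Assumptions of this sketch: non-centrality is delta * sqrt(n), and the
    non-central t probability P(T' <= t) is approximated by Phi(x) with
    x = (t * (1 - 1/(4*df)) - nc) / sqrt(1 + t**2 / (2*df)).
    """
    nc = delta * math.sqrt(n)
    x = (t_crit * (1.0 - 1.0 / (4.0 * df)) - nc) / math.sqrt(
        1.0 + t_crit ** 2 / (2.0 * df)
    )
    return 1.0 - NormalDist().cdf(x)


if __name__ == "__main__":
    # Paired test, n = 6, one-sided 5% level, delta = 1.
    print(round(approx_power(6, 1.0, 2.015, 5), 3))
```

For n = 6, α = .05 and δ = 1 this gives about .67, in line with the corresponding entry of Table I.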

Power efficiencies were not considered for $n < 4$ because of the difficulties 
of interpolation and the inexactness of the normal approximation in this range. 

For $n = \infty$, $t_1$ and $t_2$ both have a normal distribution with zero mean and unit 
variance. Thus the power efficiency is 100% at all significance levels for 
this case. 

These computations furnish approximate power efficiencies for n = 6, 8, 10, 
15, 25, ∞ at α = .05, .025, .01, and for n = 4 at α = .05 and .01. The other 
approximate power efficiencies listed in Table II were obtained by graphical 
interpolation from these values. 

The results of this note can be roughly summarized for $n \leq 15$ by stating 
that of the 2n sample values 

(i). approximately 1.6 values are lost at the 5% significance level, 

(ii). approximately 2.1 values are lost at the 2.5% significance level, 

(iii). approximately 2.8 values are lost at the 1% significance level, if the 
tests based on $t_1$ are used instead of the corresponding tests based on $t_2$. 
Examination of Table I shows that the number of sample values lost decreases as n 
increases for $n > 15$. 
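
The "values lost" figures can be related to the tabulated efficiencies by simple arithmetic. The sketch below assumes, as an interpretation rather than a statement of the note, that the number of values lost out of 2n is 2n(1 - e), where e is the power efficiency from Table II; the function name is hypothetical.

```python
def values_lost(n, efficiency):
    """Sample values effectively lost out of the 2n available (assumed 2n(1 - e))."""
    return 2 * n * (1.0 - efficiency)


if __name__ == "__main__":
    # Efficiencies as read from Table II for n = 6, 8, 10.
    table = {
        0.05: {6: 0.87, 8: 0.90, 10: 0.92},
        0.025: {6: 0.825, 8: 0.865, 10: 0.90},
        0.01: {6: 0.78, 8: 0.82, 10: 0.865},
    }
    for alpha, row in table.items():
        losses = ", ".join(f"n={n}: {values_lost(n, e):.2f}" for n, e in row.items())
        print(f"alpha = {alpha}: {losses}")
```

Under this reading the losses for n = 6, 8, 10 cluster near 1.6, 2.1 and 2.8 at the three significance levels.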

REFERENCES 

[1] John E. Walsh, "On the power function of the sign test for slippage of means", 
    Annals of Math. Stat., Vol. 17 (1946), pp. 360, 361. 

[2] N. L. Johnson and B. L. Welch, "Applications of the non-central t-distribution", 
    Biometrika, Vol. 31 (1940), p. 376. 

[3] J. Neyman, "Statistical problems in agricultural experimentation", Roy. Stat. Soc. 
    Suppl., Vol. 2 (1935), pp. 131, 132. 


NOTE ON THE LIAPOUNOFF INEQUALITY FOR ABSOLUTE MOMENTS 

By Maurice H. Belz 


The University of Melbourne 


For a variate x measured from the mean of the population, the absolute 
moment of order r is defined by 

$$\nu_r = \int_{-\infty}^{\infty} |x|^r \, dF(x),$$ 

where F(x) is the cumulative distribution function. Treating r as continuous, 
we have 

$$\frac{d\nu_r}{dr} = \int_{-\infty}^{\infty} |x|^r \log_e |x| \, dF(x),$$ 

the integral on the right existing if $\nu_{r+1}$ exists. 





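Write $y = \log_e \nu_r$. Then we have 

$$\frac{dy}{dr} = \frac{1}{\nu_r}\frac{d\nu_r}{dr} = \frac{1}{\nu_r}\int_{-\infty}^{\infty} |x|^r \log_e |x| \, dF(x),$$ 

$$\nu_r^2 \frac{d^2 y}{dr^2} = \nu_r \int_{-\infty}^{\infty} |x|^r (\log_e |x|)^2 \, dF(x) - \left\{\int_{-\infty}^{\infty} |x|^r \log_e |x| \, dF(x)\right\}^2 \geq 0,$$ 

by Schwarz's inequality. 

It follows that the function y is convex (or exceptionally a straight line), and, 
on referring to the figure, it appears that 

(1)    $MQ \leq MQ'$ 

for all chords PR. If the abscissae of the points L, M, N are c, b, a, respectively, 
where $c \leq b \leq a$, the inequality (1) leads at once to the relation 

$$\log_e \nu_b \leq \frac{a-b}{a-c}\log_e \nu_c + \frac{b-c}{a-c}\log_e \nu_a .$$ 

Hence 

$$\nu_b^{\,a-c} \leq \nu_c^{\,a-b}\, \nu_a^{\,b-c},$$ 

which is the usual form of the Liapounoff Inequality. 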
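
The inequality is easy to check numerically for a discrete distribution. The sketch below is only an illustration: the distribution, the orders a, b, c and the function names are arbitrary choices, and the final form of the inequality is taken to be ν_b^(a-c) ≤ ν_c^(a-b) ν_a^(b-c) with c ≤ b ≤ a.

```python
def abs_moment(values, probs, r):
    """Absolute moment of order r about the mean for a discrete distribution."""
    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * abs(v - mean) ** r for v, p in zip(values, probs))


def liapounoff_holds(values, probs, a, b, c, rel_tol=1e-12):
    """Check nu_b**(a-c) <= nu_c**(a-b) * nu_a**(b-c) for c <= b <= a."""
    nu_a = abs_moment(values, probs, a)
    nu_b = abs_moment(values, probs, b)
    nu_c = abs_moment(values, probs, c)
    lhs = nu_b ** (a - c)
    rhs = nu_c ** (a - b) * nu_a ** (b - c)
    return lhs <= rhs * (1.0 + rel_tol)


if __name__ == "__main__":
    values = [0.0, 1.0, 4.0]   # arbitrary example distribution
    probs = [0.5, 0.3, 0.2]
    print(liapounoff_holds(values, probs, a=3, b=2, c=1))
    print(liapounoff_holds(values, probs, a=2.5, b=1.7, c=0.4))
```

Non-integer orders are allowed, in keeping with the treatment of r as continuous.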


REMARK ON THE NOTE "A GENERALIZATION OF 
WARING'S FORMULA" 

By T. N. E. Greville 

U. S. Public Health Service 

Before submitting for publication the note "A generalization of Waring's 
formula," Annals of Math. Stat., Vol. 15 (1944), pp. 218-219, the author made a 
diligent effort to ascertain, through correspondence with mathematicians and 
actuaries both in this country and abroad, whether the generalized formula in 
question had been previously published, and none of the authorities 
communicated with knew of its prior publication. However, it has now come to his 
attention that the formula was published in essentially the same form by Hermite 
in the article "Sur la formule d'interpolation de Lagrange", Journal für die 
Reine und Angewandte Mathematik ("Crelle's Journal"), Vol. 84 (1878), 
pp. 70-79. 

1. Estimation of Parameters in Truncated Pearson Frequency Distributions. 
A. C. Cohen, University of Georgia. 

Given a truncated univariate Pearson frequency distribution, parameters of the 
complete distribution are required. Karl Pearson and Alice Lee (Biometrika, Vol. 6 (1908), 
pp. 59-68) and R. A. Fisher (Introduction to Mathematical Tables, Vol. 1, British Assn. 
Adv. Sci., 1931, pp. xxvi-xxxv) obtained solutions of the truncated normal distribution 
with a single tail missing. The present paper presents three general methods of solution 
applicable to any of the Pearson distributions. The first utilizes moments of a higher order 
than are required to characterize corresponding complete distributions. The order of 
the highest moment required is increased by one for each missing tail. The second method, 
applicable when only a single tail is missing, utilizes the terminal ordinate at the point of 
truncation and moments of the same order as required to characterize the complete 
distribution. The terminal ordinate is evaluated by successive approximations. The third 
method utilizes only the first two moments, but requires that the given distribution be 
further truncated and that moments be computed both before and after the additional 
truncations. This latter method can also be applied to complete distributions to avoid 
direct computation of third and fourth order moments. 


2. Distribution of a Root of Determinantal Equation. D. N. Nanda, University 
of North Carolina. 

The joint distribution of the roots of a determinantal equation was given by P. L. Hsu 
in 1939 and the distribution of any one of the roots was studied by S. N. Roy. The present 
paper, however, gives a different method of working out the distribution of any root, 
specified by its place in a monotonic arrangement. This method enables us to express the 
distribution of a root of a certain determinantal equation in terms of a linear combination 
of products of incomplete beta integrals and in terms of the distribution of a root of 
lower-order determinantal equations. 


3. The Power of Certain Non-Parametric Tests of Independence. Wassily 
Hoeffding, University of North Carolina. 

Several tests of independence have been proposed which are based on statistics depending 
only on the ranks of the sample values. Under the hypothesis $H_0$ of independence the 
distribution of such statistics does not depend on the form of the parent distribution. 
Two of those statistics, Spearman's rank correlation coefficient and Lindeberg-Kendall's 
statistic based on the number of inversions in the permutation of the ranks, are shown to 
be asymptotically normally distributed in samples from any population (the limiting 
normal distribution being singular in certain degenerate cases). The asymptotic distribution 
of the two coefficients reveals that the corresponding tests of independence are inconsistent 
[...] true), and at least one of them is biased in the limit [...]. For some 
sample sizes and some sizes of the critical region there do not exist [...] tests of 
independence based on ranks. But there do exist rank tests of independence which are 
consistent, and hence unbiased in the limit. Examples of such tests are given. 

4. Some Significance Tests for the Mean Using the Sample Range and Midrange. 
John E. Walsh, Princeton University. 

Consider a sample of size n, (2 < n < 10), drawn from a normal population with mean $\mu$. 
Let $x_n$ be the largest value and $x_1$ the smallest value of the sample. Significance tests are 
developed to compare $\mu$ with a given hypothetical value $\mu_0$ by use of the sample. These 
significance tests are based on the quantity $D = [\tfrac{1}{2}(x_1 + x_n) - \mu_0]/(x_n - x_1)$ = [(sample 
midrange) - (hypothetical mean)]/(sample range). One-sided and symmetrical tests are 
considered. Values of $D_\alpha$ such that $\Pr(D > D_\alpha \mid \mu = \mu_0) = \alpha$ are computed for 
$\alpha$ = .05, .025, .01, .005. These values of $D_\alpha$ can be used to obtain one-sided tests at 
the .05, .025, .01, .005 significance levels and symmetrical tests at the .10, .05, .02, .01 
significance levels. Efficiencies are computed for one-sided tests at the .05 and .01 
significance levels. The efficiency is at least 90% for $n \leq$ [...] at the .05 significance level 
and for $n \leq 8$ at the .01 level. The range-midrange test can be applied without computation 
through the use of an easily constructed graph. The application of a test requires only the 
plotting of the sample point $(x_1, x_n)$ on this graph. 
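
The quantity D is simple to compute from a sample. The following is a minimal sketch, assuming D is the sample midrange minus the hypothetical mean, divided by the sample range; the critical values D_α are not given in this abstract, so the sketch stops at the statistic itself. The function name and the data are illustrative.

```python
def range_midrange_statistic(sample, mu0):
    """D = (midrange - mu0) / range, using only the smallest and largest values."""
    smallest, largest = min(sample), max(sample)
    if largest == smallest:
        raise ValueError("sample range is zero; D is undefined")
    midrange = 0.5 * (smallest + largest)
    return (midrange - mu0) / (largest - smallest)


if __name__ == "__main__":
    sample = [2.0, 5.0, 3.0, 8.0]  # made-up values
    print(range_midrange_statistic(sample, mu0=4.0))
```

Only the extreme observations enter the calculation, which is why the test can be carried out graphically from the single point formed by the smallest and largest values.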

5. Testing Compound Symmetry in a Normal Multivariate Distribution. David 
F. Votaw, Jr., Princeton University. 

Let F(X) be the d.f. of a t-order vector variate X ($t \geq 3$). Suppose the components of X 
are divided into mutually exclusive and exhaustive subsets. F(X) is said to be compound 
symmetric, for the given division of its variates into subsets, if it is invariant over all 
permutations of its variates within these subsets. F(X) is completely symmetric if the 
invariance holds over all permutations of its variates. If F(X) is normal and compound 
symmetric, then within each subset of variates the means are equal, the variances are equal 
and the covariances are equal, and between any two subsets of variates the covariances 
are equal. Testing hypotheses of compound or complete symmetry in a normal F(X) 
is of interest, for example, in studying psychological examinations and in medical research. 

In this paper likelihood ratio criteria are developed for testing various hypotheses 
involving compound symmetry in regard to a normal distribution and to k normal 
distributions ($k \geq 2$). Given that the corresponding null hypothesis is true, the moments 
of each criterion are obtained explicitly and the distribution of each criterion is identified 
as the product of independent beta variates (in the case of a single normal distribution, 
the distributions are given explicitly for t = 3, 4, and 6 for certain divisions of the variates 
into subsets). In a previous paper Wilks has given results on a very thorough study of 
the sampling theory of likelihood ratio criteria for various hypotheses involving complete 
symmetry in regard to a normal distribution. 

6. Effects of Non-Normality at High Significance Levels. Harold Hotelling, 
University of North Carolina. 

The effects of non-normality in the underlying population on the probabilities of 
significance by customary statistical tests are not well understood, in spite of numerous 
attacks, both mathematical and experimental, on the problem. Chung's recent proof that 
the Student ratio t has, in samples from an arbitrary population, a distribution 
approaching normality for large samples tends to confirm the common idea that 
non-normality makes little difference if only the sample is fairly large, but this holds 
only for a fixed range of values of t while the sample number N increases. The tail areas 
beyond a deviation which increases with N in certain ways often behave quite differently 
than in sampling from a normal population. If p is the probability that $|t| > t_0$ in 
samples of N from a normal population and p' is the corresponding probability for another 
population, it is shown that $\lim p'/p$ may be zero or infinite or may take any 
finite value, even when the non-normal distribution involved is of simple and realistic 
continuous forms. The conditions that this limit be unity are concerned only with the 
shoulders of the population histogram, and have nothing to do with its moments or its 
behavior at infinity or at its mean. This suggests that caution should be used in applying 
familiar tests with high significance levels; that further calculations should be directed 
toward making this caution quantitatively definite; and that the use of sample moments 
or cumulants cannot lead to the most appropriate criterion of non-normality for this 
purpose. 


7. On the Problem of Similar Regions. E. L. Lehmann, University of California, 
Berkeley, and Henry Scheffé, University of California, Los Angeles. 

If $X = (X_1, \dots, X_n)$ is a set of random variables with a joint probability density 
depending on a set of parameters $\theta = (\theta_1, \dots, \theta_m)$, and if $T = (T_1, \dots, T_m)$ is a set of 
sufficient statistics for $\theta$, then Neyman (Phil. Trans. Roy. Soc. London, Vol. 236 (1937), 
pp. 333-380) has proved that a region w in the space of X is similar with respect to $\theta$ if it 
has the following structure: The intersections w(t) of w with the surfaces T = t have the 
property that the conditional probability of the sample point X falling into w given that 
T = t does not depend on t. In the present paper a necessary and sufficient condition is 
found for the regions with the above structure to be the only similar regions. This 
condition is shown to be satisfied for a certain class K of probability densities which contains 
as special cases all densities for which the totality of similar regions has been previously 
determined. In particular the partial differential equations which Neyman (Annals of 
Math. Stat., Vol. 12 (1941), pp. 46-76) assumed were satisfied in his solution of the problem 
of similar regions are solved and it is shown that any density satisfying these equations 
belongs to the above class K. 


8. Fourth Degree Exponential Function. L. A. Aroian and Marguerite 
Darkow, Hunter College. 

It is shown that the fourth degree exponential function is supported by the Bernoulli 
probability function and the hypergeometric probability function as well as being the 
function for which the method of moments is the best method according to the criterion of 
maximum likelihood. In the general situation six moments, at most, are needed. The 
function is classified into two general groups depending on symmetry or asymmetry and 
each case is divided again into unimodal and bimodal distributions. Examples show that 
the function is very successful in graduating the main Pearson types and the Gram-Charlier 
Type A frequency function. Various generalizations of the exponential function are 
indicated. In addition to its wide generality, the greatest practical advantage of the new 
system is the simplicity of the numerical calculations. 


9. A General Weak Limit Theorem for Independent Distributions. P. L. Hsu, 
University of North Carolina. (Read by title.) 

For every positive integer n let there be n distribution functions $F_{n1}(x), F_{n2}(x), \dots, 
F_{nn}(x)$. Assume that $\lim_{n\to\infty} \max_{1 \leq j \leq n} \{1 - F_{nj}(x) + F_{nj}(-x)\} = 0$. Let $F_n(x)$ 
be the convolution $F_{n1}(x) * F_{n2}(x) * \cdots * F_{nn}(x)$. Let 

$$\psi(t) = imt + \int_{-\infty}^{\infty} \left[ e^{itx} - 1 - \frac{itx}{1 + x^2} \right] \frac{1 + x^2}{x^2} \, dG(x),$$ 

with $G(x)$ non-decreasing and $G(\infty) - G(-\infty) < \infty$. Let F(x) be the 
(infinitely divisible) distribution law having $\exp \psi(t)$ as its characteristic function. 
In order to have $\lim_{n\to\infty} F_n(x) = F(x)$ at every continuity point of F(x), it is necessary 
and sufficient that the following relations hold at every x > 0 such that ±x are continuity 
points of G(y): 

(I)    $$\lim_{n\to\infty} \sum_{j=1}^{n} \int_{|y|>x} dF_{nj}(y) = \int_{|y|>x} \frac{1 + y^2}{y^2} \, dG(y),$$ 

(II)   $$\lim_{n\to\infty} \sum_{j=1}^{n} \left[ \int_{|y|<x} y^2 \, dF_{nj}(y) - \left( \int_{|y|<x} y \, dF_{nj}(y) \right)^2 \right] = \int_{|y|<x} (1 + y^2) \, dG(y),$$ 

(III)  $$\lim_{n\to\infty} \sum_{j=1}^{n} \int_{|y|<x} y \, dF_{nj}(y) = m + \int_{|y|<x} y \, dG(y) - \int_{|y|>x} \frac{1}{y} \, dG(y).$$ 


10. On the Maximum Partial Sums of Sequences of Independent Random 
Variables. K. L. Chung, Princeton University. 

The asymptotic behavior of the maximum partial sums of a sequence of independent 
random variables is studied in this paper. Two groups of new limit theorems are 
established under general conditions. The first group deals with theorems of the weak type. 
The limiting distribution of the maximum partial sums is obtained with an estimate of 
the remainder, thus improving a recent result of Erdös and Kac. Another estimate is 
obtained for a different domain of variation, which plays an essential role in the sequel. 
These results correspond to the sharper forms of the central limit theorem. In the second 
group, theorems of the strong type are obtained, giving precise lower bounds (in the sense 
of probability) for the maximum partial sums. These results form the exact counterpart 
to the general form of the law of the iterated logarithm, due to Feller, which gives the 
precise upper bounds. A summary of the main results and methods has appeared in Proc. 
Nat. Acad. of Sci., Vol. 33 (1947), pp. 132-136. 


11. Some Results on the Distribution of Quadratic Forms From Gaussian 
Stochastic Processes. (Preliminary report). Herman Rubin, Cowles 
Commission. 

If one considers the estimation of the parameters of a Gaussian stochastic process 
whose elements are continuous functions from the functional values over a finite interval, 
one often finds that certain parameters can be estimated exactly, and certain parameters 
can not. This result often depends on the distribution of quadratic functionals whose 
arguments are elements of the stochastic process under consideration. In this paper, it 
is shown that the elements of a certain class of quadratic functionals have distributions 
concentrated at a point, and that the elements of a different class do not; in this latter case, 
the characteristic function is computed. 


12. Some Significance Tests for the Median which are Valid under Very General 
Conditions. (Preliminary Report) John E. Walsh, Princeton University. 
(Read by title.) 

Consider n independent values drawn from populations necessarily satisfying only: 1) 
Each population has a unique median. 2) The median has the same value $\phi$ for each 
population. 3) Each population is symmetrical. 4) Each population is continuous. (It 
is to be emphasised that no two of the values are necessarily drawn from the same 
population.) Significance tests are derived for $\phi$ on the basis of 1)-4). These significance tests 
are based on order statistics of certain combinations of order statistics, each combination 
being either a single order statistic of the n values or one-half the sum of two order statistics. 
The tests are invariant under permutation of the n values and reasonably efficient if the 
values represent a sample from a normal population. The significance levels are of the 
form $r/2^n$, $(r = 1, \dots, 2^n - 1)$. Each value of r can be obtained for some one-sided 
significance test. Thus any significance level can be closely approximated if n is large. The 
major disadvantage of these tests is the limited number of suitable significance levels 
available for small values of n. This disadvantage is partially eliminated by the development of 
tests which have a specified significance level if the values are a sample from a normal 
population and a significance level bounded near this specified value if only 1)-4) necessarily 
hold. Results based on 1)-4) are applied to several well known statistical problems. 
Tests are obtained for the mean on the basis of a large number of independent values from 
populations having the mean but little else in common. Also generalized results are 
obtained for the Behrens-Fisher problem, quality control, slippage tests, the sign test and 
cases where some of the n values are dependent. 

13. Loss of Information in t-tests with Unbalanced Samples. (Preliminary 
Report) John E. Walsh, Princeton University. (Read by title.) 

Consider two normal populations, the first with mean $a_1$ and variance $\sigma_1^2$, the second with 
mean $a_2$ and variance $\sigma_2^2$, while $\sigma_1/\sigma_2$ has a known value C. If the hypothesis $a_1 = a_2$ is to 
be tested by a t-test (one-sided or symmetrical) using $n_1$ sample values from the first 
population and $n_2$ values from the second population ($n_1 + n_2 = n$, fixed), it is shown that this 
experiment is most powerful when $n_1/n_2 = \sigma_1/\sigma_2$ (integer considerations neglected). The 
t-tests satisfying this condition will be referred to as balanced t-tests. Thus information 
will be lost by not using a balanced experiment. A quantitative measure of the information 
lost by using given values of $n_1$ and $n_2$ is determined by the total sample size m, ($m_1 + m_2 = 
m$), of the balanced t-test (same significance level) which has approximately the same power. 
Then $n - m$ sample values are wasted by using $(n_1, n_2)$ rather than $(m_1, m_2)$, i.e. only 
100m/n% of the information obtainable per observation is used by $(n_1, n_2)$. A 
symmetrical t-test with significance level $2\alpha$ has the same value of m as a one-sided t-test with 
significance level $\alpha$. For one-sided t-tests with significance level $\alpha$: 

$$m = \tfrac{1}{2}\left(B + \sqrt{B^2 - 8A}\right),$$ 

where $B = 2 + A + K_\alpha^2/2$, $A = (C + 1)^2 [1 - K_\alpha^2/2(n - 2)] / [C^2/n_1 + 1/n_2]$, and $K_\alpha$ is the 
standardized normal deviate exceeded with probability $\alpha$. This approximation to m is 
valid for $m \geq 5$ if $\alpha = .05$, $m \geq 6$ if $\alpha = .025$, $m \geq 7$ if $\alpha = .01$, $m \geq 8$ if $\alpha = .005$. (A 
fractional value of m represents an interpolated measure of the sample size of the equivalent 
balanced experiment.) 
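
The equivalent balanced sample size can be sketched numerically. The printed formula is badly damaged in the available copy, so the sketch below rests on an assumed reading: m = (B + sqrt(B^2 - 8A))/2 with B = 2 + A + K^2/2 and A = (C + 1)^2 [1 - K^2/(2(n - 2))] / (C^2/n_1 + 1/n_2), where K is the normal deviate exceeded with probability α. One point in favor of this reading is that it returns m = n exactly when n_1/n_2 = C. The function name is hypothetical.

```python
import math


def equivalent_balanced_size(n1, n2, c, k_alpha):
    """Total size m of the balanced t-test with about the same power.

    Uses an assumed reconstruction of the abstract's formula; c is the known
    ratio sigma_1 / sigma_2 and k_alpha the upper-alpha normal deviate.
    """
    n = n1 + n2
    a = (c + 1.0) ** 2 * (1.0 - k_alpha ** 2 / (2.0 * (n - 2))) / (c ** 2 / n1 + 1.0 / n2)
    b = 2.0 + a + k_alpha ** 2 / 2.0
    return 0.5 * (b + math.sqrt(b * b - 8.0 * a))


if __name__ == "__main__":
    k05 = 1.645  # normal deviate exceeded with probability .05
    print(equivalent_balanced_size(10, 10, 1.0, k05))  # balanced split
    print(equivalent_balanced_size(15, 5, 1.0, k05))   # unbalanced split
```

With equal variances (C = 1), a 15 and 5 split of twenty observations comes out equivalent to a balanced experiment of noticeably fewer than twenty values.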

14. Some Theorems on the Bernoullian Multiplicative Process. T. E. Harris, 
Princeton University. (Read by title.) 

A single entity may have j descendants with probability $p_j$, $(j = 0, 1, 2, \dots)$. Each 
first generation entity has then the same procreative probabilities, etc. Let 

$$f(s) = p_0 + p_1 s + p_2 s^2 + \cdots .$$ 

If $z_n$ is the number of entities in the nth generation, it is known that $P(z_n = j)$ is given by 
the coefficient of $s^j$ in the nth iterate $f[f \cdots (f)] = f_n(s)$. Let $E z_1 = x$, $1 < x < \infty$. 
Conditions are given insuring that as $n \to \infty$ the cumulative distribution of the variate $z_n/x^n$ 
approaches a limit-function which is absolutely continuous except for a possible single 
jump. Let g(u) be the corresponding frequency function. If f(s) is a polynomial of degree 
k, let $\gamma = \log_x k/(\log_x k - 1)$. Otherwise, $\gamma = 1$. Then $g(u) \exp[u^{\gamma + \epsilon}]$ [is, is not] summable 
$(0, \infty)$ according as $\epsilon$ is [negative, positive]. Behavior of g(u) near u = 0 is also considered. 
Special cases are considered where g(u) = constant [...], m a positive integer. 
Maximum likelihood estimates for the parameters $p_0, p_1, \dots$, and x are obtained as functions 
of n successive values $z_1, z_2, \dots, z_n$. Consistency, in a certain sense, is proved. A 
specialized method is given for finding the moment-generating function of the variate N, 
the smallest value of n such that $z_n = 0$. 
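
The statement that P(z_n = j) is the coefficient of s^j in the nth iterate of f(s) can be illustrated when f is a polynomial. The sketch below is a minimal illustration with made-up offspring probabilities; it represents a polynomial by its list of coefficients and composes f with itself n times. All names are hypothetical.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest order first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def compose(outer, inner):
    """Coefficients of outer(inner(s)), evaluated by Horner's rule."""
    result = [outer[-1]]
    for coeff in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += coeff
    return result


def generation_distribution(p, n):
    """Distribution of z_n: coefficients of the nth iterate of f(s) = sum p_j s^j."""
    iterate = [0.0, 1.0]  # the identity s, i.e. a single ancestor
    for _ in range(n):
        iterate = compose(p, iterate)
    return iterate


if __name__ == "__main__":
    p = [0.2, 0.3, 0.5]  # made-up probabilities of 0, 1, 2 descendants
    for j, prob in enumerate(generation_distribution(p, 2)):
        print(f"P(z_2 = {j}) = {prob:.4f}")
```

For these probabilities the chance of extinction by the second generation is f(f(0)) = f(0.2) = 0.28, which is the constant term of the second iterate.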

Readers are invited to submit to the Secretary of the Institute news items of interest 

Personal Items 

Dr. George K. Albert has been appointed to an associate professorship at the 
University of Tennessee. 

Dr. T. W. Anderson, Jr. has been promoted to an assistant professorship in 
the Department of Mathematical Statistics at Columbia University. He is on 
leave the first half of the 1947-48 academic year at the Institute of Actuarial 
Mathematics and Mathematical Statistics, Stockholm University, as a 
Guggenheim Fellow. During the second half of the academic year he will be at 
Cambridge University. 

Associate Professor Max Astrachan has been promoted to a full professorship 
at Antioch College, Yellow Springs, Ohio. 

Associate Professor T. A. Bancroft, who has been at the University of Georgia, 
Athens, Georgia, is now with the Statistical Laboratory, Alabama Polytechnic 
Institute, Auburn, Alabama. 

Dr. M. S. Bartlett of Cambridge University has been appointed as Professor 
of Mathematical Statistics at the University of Manchester, Manchester, 
England. The position is a newly created one. Professor Bartlett indicates 
that this position is believed to be the first official professorship in mathematical 
statistics in England. 

Professor M. A. Brumbaugh has accepted a position with the Bristol 
Laboratories Inc., Syracuse 1, New York. 

Dr. Donald A. Darling has been appointed Research Associate at Cornell 
University. 

Professor D. B. DeLury of the Virginia Polytechnic Institute has accepted a 
position with the Ontario Research Foundation, 43 Queens Park, Toronto 5, 
Canada. 

Professor Abel Gauthier of the University of Montreal has been appointed 
Head of the Institute of Mathematics and Assistant-Secretary of the Faculty of 
Science at that institution. 

Dr. Casper Goffman, former assistant professor in the Mathematics 
Department, University of Kentucky, is now in the Mathematics Department, 
University of Oklahoma, Norman, Oklahoma. 

Mr. Philip Hardy has returned to the General Electric Company at Warren, 
Ohio after serving at Wright Field. 

Dr. Carl F. Kossack, who has been with the Navy Department in Washington, 
D. C. as an Air Intelligence Specialist, has accepted an associate professorship in 
the Department of Mathematics at Purdue University. 

Mr. Frank Jones Massey, Jr. is now teaching in the Department of 
Mathematics, University of Maryland, College Park, Maryland. 

Dr. William Burton Michael, who has been Lecturer in Mathematics, 
Psychology and Educational Psychology at the University of Southern California, 
has now accepted an assistant professorship in the Department of Psychology, 
Princeton University. He is also a member of the Research Department, 
College Entrance Examination Board at Princeton. 

Mr. Bernard Ostle, a former teaching assistant, School of Business 
Administration, [...] at Iowa State College, Ames, Iowa. 

Mr. Maurice H. Quenouille, who was formerly with the Rothamsted 
Experimental Station, Harpenden, Herts, England, has accepted the position of 
Lecturer in Statistics, Marischal College, University of Aberdeen, Scotland. 

Dr. James A. Rafferty left the Department of Pathology, University of 
Rochester in June and has been appointed Chief of the Department of Statistics, 
Air University, School of Aviation Medicine, Randolph Field, Texas. 

Miss Mary Ann Savas has accepted a position with General Motors, Detroit, 
Michigan. 

Professor George J. Stigler, formerly with Brown University, is now teaching 
in the Department of Economics, Columbia University, New York, New York. 

Professor E. L. Welker has resigned an associate professorship in mathematics 
at the University of Illinois to become Associate in Mathematics in the Bureau of 
Medical Economic Research of the American Medical Association. 

Mr. Sol M. Wezelman, who completed his master's degree in actuarial science 
at the University of Michigan in June, has accepted a position as Assistant 
Actuary in the North Dakota State Department of Insurance, Bismarck. 

Dr. Bertram Yood has received his doctorate at Yale and is now on the staff 
at Cornell University. 

Mr. Earl K. Yost, Jr. has accepted a position with the General Electric Co. at 
the Hanford Engineering Project, Richland, Washington. 

Professor James G. Smith, of Princeton University, died at Princeton on 
November 28, 1946. 


Beginning with the October issue, the quarterly journal Mathematical Tables 
and Other Aids to Computation will publish a new feature section, “Automatic 
Computing Machinery,” designed to disseminate information and news on 
research and development in the field of high-speed automatic calculating 
machinery. Material should fall under the general headings of Bibliography,
Technical Developments, Discussion (including correspondence), and News. 
Contributions to this section are invited and should be addressed to Dr. E. W. 
Cannon, Head of the Mathematics Group, Machine Development Laboratory, 
National Bureau of Standards, Washington, D. C. 


Institute of Numerical Analysis Established 
Plans have been completed for the establishment of one of the newest units of 
the National Bureau of Standards — the Institute of Numerical Analysis at the 
University of California at Los Angeles, according to an announcement by Dr.
Edward U. Condon, Director of the Bureau.

One of the giant high-speed electronic computing machines, now under
development by the Bureau of Standards, will be installed at the Institute when
completed. Design specifications call for high memory capacity and
automatically sequenced mathematical operations from start to finish at speeds
attainable only with electronic equipment.

The Institute has two primary functions. The first is research in applied
mathematics aimed at developing methods of analysis which will extend the use
of the high-speed electronic computers. The second is to act as a service group
for Western industries, research institutions, and government agencies. The
service function will include not only the use of the machines for problem solving
but also assistance in the formulation of problems in applied mathematics of the
more complex and novel types. Service operations are to be initiated
immediately, using the latest types of commercially available computing equipment.

The decision to locate the Institute at the University of California at Los
Angeles was made after a nation-wide survey by the National Bureau of
Standards. Centers in the East and Middle West were considered as well as the Far
West, but Los Angeles, it was decided, offered the widest range of possibilities
for an Institute of Numerical Analysis. Concentration of aircraft industries and
the presence of several major scientific institutions were critical in the choice of
Los Angeles.


Election of Fellows 

The Board of Directors announced at the Yale Meeting the election of the
following members as Fellows of the Institute: Theodore W. Anderson, Jr.,
Alexander C. Aitken, David H. Blackwell, Georges Darmois, Ragnar Frisch,
Robert C. Geary, Frederick Mosteller, Gerhard Tintner, Charles P. Winsor and
John Wishart.


New Members 


The following persons have been elected to membership in the Institute
(June 1 to August 20, 1947)

Baldwin, Helen Mildred, B.S. (Cornell) Research Associate in Statistics, Atomic Energy
Project, 2tS Avenue C, Rochester S, N. Y.

Blunk, Paul M., A.B. Teaching asst. and grad. student, Univ. of Calif., Box SS2, Fair
Oaks, Calif.

Bowden, George Edwin, B.S. (Duke) Teaching asst., Math. Dept., White Hall, Cornell
Univ., Ithaca, N. Y.

Bradley, Ralph Allan, M.A. (Queen's Univ.) Grad. student, Univ. North Carolina,
Wellington, Ontario, Canada.

Button, Kenneth John, Head of Statistics Section, British Employers' Confederation, 1(1
Rutherwyke Close, Ewell, Surrey, England.

Carlson, Phillip G., Jr., A.M. (Columbia) H8 Cornell Street, Roslindale SI, Mass.
Carol, Bernard, M y E (Columbia) Graduate student at Columbia Univ., 15 West 96th
Street, N. Y.

Diver, M. L., M.E. (Purdue) Consulting Engineer, P.O. Box 1016, San Antonio B, Texas.

Erasmus, Josias C. (Stellenbosch, South Africa) Research Officer,
Grootfontein College of Agriculture, Middelburg, C.P., South Africa.

Gottlieb, Morris J., Ph.D. (Wash. Univ., St. Louis) Member of the Institute for Advanced
Study, Washington University, St. Louis, Mo.

Greenwood, Joseph Arthur, A.B. (Harvard) Student at Harvard University, BB Oxford
St., Cambridge SS, Mass.

Gysbers, Jack C., M.A. (Univ. of Calif.) Teaching asst., Dept. of Math., Univ. of Calif.,
£0S9 Berkeley Way, Berkeley i, Calif.

Haskind, Mina, B.S. (Brooklyn College) Student at Brooklyn College, 763 Eastern
Parkway, Brooklyn IS, New York.

Hauser, Dr. Philip M., Ph.D. (Univ. of Chicago) University of Chicago, Chicago 37, Ill.

Hoyt, Cyril J., Ph.D. (Univ. of Minn.) Research Associate, Dept. of Education,
University of Chicago, Chicago, Ill.

Kern, Enrique Roberto, First Assistant, Institute of Biometry, Univ. of Buenos Aires,
Rivadavia 8854, Buenos Aires, Argentina.

Mark, Abraham M., Ph.D. (Cornell) Mathematics Department, Univ. of Wisconsin,
Madison, Wisconsin.

Moss, George G., II, B.A. (St. John's College, Annapolis) Actuarial Statistician,
Metropolitan Life, S771 Morris Ave., N. Y. 58, N. Y.

Phillips, Bernard B., A.M. (Columbia) Box 147, Cathedral Station, New York 25, N. Y.

Radvanyi, Laszlo, Ph.D. (Univ. of Heidelberg) Professor of Economics, National Univ. of
Mexico, Donato Guerra 1, desp. S07, Mexico, D. F.

Richardson, John M., Ph.D. (Cornell) Member of Technical Staff, Bell Telephone
Laboratories, Inc., Murray Hill, New Jersey.

Royston, Robert W., M.S. (Univ. of Mich.) Asst. Prof., Math. Dept., Wash. & Lee Univ.,
117 W. Washington St., Lexington, Virginia.

Savas, Mary A., A.B. (Univ. of Mich.) Student at Univ. of Mich., 334 E. Second St.,
Monroe, Mich.

Shepard, David H., A.B. (Univ. of Mich.) Research Analyst, Army Security Agency,
BOS Randolph Street, Falls Church, Virginia.

Throdahl, Monte C., B.S. (Iowa State College) Research Chemist in Charge of Rubber
Lab., Monsanto Chemical Co., Nitro, West Virginia.

Uchytil, Jan, Doctor of Science, Chief of Production Control Dept. in Central Federation
of Czech. Industry, Praha II, Prikopy 14, Czech.

Vergara, Jose, Doctor of Engineering ... Professor of Economics, Madrid, Chief
of the Bureau of Statistics, ...

... 5868 S. Blackstone Ave., Chicago 37, Illinois.

Wei, Dzung-shu, Ph.D. (Univ. of Iowa) Prof. and Head of Math. Dept., St. John's Univ.,
Shanghai, 189 East 10th St., New York 3, N. Y.

Wolfson, Jacob, B.A. (New York College) Statistician, Social Security Adm.,
845 Brunswick Road, Essex, Maryland.



REPORT ON THE NEW HAVEN MEETING OF THE INSTITUTE 

The Tenth Summer Meeting of the Institute of Mathematical Statistics was
held at Yale University, New Haven, Connecticut, Tuesday, September 2
through Thursday, September 4, 1947. The meeting was held in conjunction
with the summer meetings of the American Mathematical Society and the
Mathematical Association of America. The following 150 members of the
Institute attended the meeting:

C. B. Allendoerfer, R. L. Anderson, H. K. Ariinki, L. A. Aroian, H. M. Bacon, J. L. Barnes,
W. D. Baten, R. E. Bechhofer, A. A. Bennett, Joseph Berkson, D. H. Blackwell, C. I. Bliss,
Colin Blyth, Jr., A. E. Brandt, G. M. Brown, R. H. Brown, O. P. Bruno, P. T. Bruyere,
Mrs. P. T. Bruyere, J. H. Bushey, B. H. Camp, G. C. Campbell, Uttam Chand, K. L. Chung,
W. G. Cochran, A. C. Cohen, Jr., E. P. Coleman, T. E. Cope, G. M. Cox, C. C. Craig, E. L.
Crow, H. B. Curry, G. B. Dantzig, M. D. Darkow, B. B. Day, Bernard Dimsdale, C. E.
Dieulefait, C. W. Dunnett, Churchill Eisenhart, L. R. Elveback, M. W. Eudey, H. P. Evans,
William Feller, G. D. Ferris, M. M. Flood, R. M. Foster, H. A. Freeman, J. K. Knnind, H. P.
Geiringer, M. J. Gottlieb, J. Arthur Greenwood, Evelyn tirtKWiiiuu, F. E. Grubbs, H. T.
Guard, P. R. Halmos, Max Halperin, M. H. Hansen, B. I. Hart, Mina Haskind, Wassily
Hoeffding, R. H. Hoskins, Harold Hotelling, A. S. Householder, Jaroslav Janko, Irving
Kaplansky, Leo Katz, Oscar Kempthorne, E. M. Kennedy, W. L. Kichline, G. J. Kirrlicn,
L. F. Knudsen, H. S. Konijn, C. F. Kossack, Jack Laderman, H. G. Landau, E. L. Lehmann,
R. A. Leibler, Walter Leighton, Jr., F. C. Leone, Joseph Lev, Howard Levene, Julius
Lieblein, Arthur Linder, S. B. Littauer, K. D. Lowry, H. F. MacNeish, P. J. McCarthy, John
Mandel, H. B. Mann, Sophie Marcuse, F. J. Massey, Margaret Merrell, E. B. Mode, M. B.
Moore, Frederick Mosteller, D. N. Nanda, P. M. Neurath, M. G. Keimlenburg, G. E.
Noether, M. L. Norden, H. W. Norton, P. S. Olmstead, A. L. O'Toole, E. R. Ott, T. K. Oxtoby,
Edward Paulson, M. P. Peisakoff, G. B. Price, J. A. Rafferty, L. J. Reed, C. J. Rees, P. R.
Rider, John Riordan, H. E. Robbins, Milton da Silva Rodrigues, A. C. Rosander, Ernest
Rubin, Herman Rubin, Frank Saidel, M. M. Sandomire, Arthur Sard, Max Sasuly, F. E.
Satterthwaite, E. D. Schell, Jack Sherman, Rosedith Sitgreaves, Andrew Sobczyk, Milton
Sobel, Herbert Solomon, Mortimer Spiegelman, Arthur Stein, Henry Teicher, R. M. Thrall,
Gerhard Tintner, M. N. Torrey, J. W. Tukey, D. F. Votaw, Jr., Abraham Wald, H. M.
Walker, J. E. Walsh, R. M. Walter, J. R. Watkins, Dzung-shu Wei, E. S. Weiss, S. S. Wilks,
C. P. Winsor, H. O. Wold, Jacob Wolfowitz, C. A. Wright, Bertram Yood.

The Tuesday afternoon session was devoted to a symposium on 2 x 2 tables 
with Professor Lowell J. Reed of Johns Hopkins University serving as chairman. 
Addresses were given on Tests of Significance by Dr. Churchill Eisenhart,
National Bureau of Standards; Estimation by Dr. Charles P. Winsor, Johns Hopkins
University; and Non-Standard Cases by Dr. Joseph Berkson, Mayo Clinic.
Discussants were Mr. William F. Taylor, Dr. Frederick Mosteller, Professor
David H. Blackwell and Professor John W. Tukey. The attendance was
approximately 130.

The first Wednesday morning session was devoted to contributed papers. 
Professor John W. Tukey of Princeton University presided. The attendance
was approximately 86. The following three papers were presented:

1. Estimation of Parameters in Truncated Pearson Frequency Distributions.

Professor A. C. Cohen, University of Georgia.
2. Distribution of a Root of a Determinantal Equation.
Mr. D. N. Nanda, University of North Carolina.

3. The Power of Certain Non-Parametric Tests of Independence.
Dr. Wassily Hoeffding, University of North Carolina.

The second Wednesday morning session was held with Professor Will Feller, 
President of the Institute, presiding. Professor R. A. Fisher, University of 
Cambridge, gave the address under the title The Fitting of Gene Frequencies to 
Data for Genotypes. The attendance was approximately 160.

The membership business meeting of the Institute was held at 9:16, Thursday
morning, in 102 Chittenden Hall with President Feller presiding. The
attendance was approximately 65. It was voted to make certain changes in the
By-Laws and in particular to raise the dues to $7.00 per year. (An exception is
made for those living outside the Western Hemisphere.) Morris Hansen,
Chairman of the Committee on Planning and Development, initiated a lively
discussion with reference to desirable changes in the Constitution.

On Thursday morning at 10:30, with President Feller presiding, Professor A. 
Wald of Columbia University presented the Henry Lewis Rietz Lecture on 
Sequential Estimation and Multi-Decisions. The attendance was approximately
150. 

A joint session with the American Mathematical Society was held early 
Thursday afternoon at which Professor S. S. Wilks of Princeton University gave
a lecture on Sampling Theory of Order Statistics. Professor Harold Hotelling of
the University of North Carolina was the presiding officer. The attendance was 
approximately 300. 

This session was followed by another joint session with the American Mathe- 
matical Society which was devoted to contributed papers. Professor John W. 
Tukey presided at this session and the attendance was approximately 115. The 
following seven papers were presented: 


1. Some Significance Tests for the Mean Using the Sample Range and Midrange. 

Mr. John Walsh, Princeton University.

2. Testing Compound Symmetry in a Normal Multivariate Distribution.

Dr. David F. Votaw, Jr., Princeton University.

3. ... High Significance Levels.

Professor Harold Hotelling, ...

4. ...

... University of California, Berkeley, and Professor Henry ..., ... at Los Angeles.

5. The Fourth Degree Exponential Function.

Dr. Leo A. Aroian and Professor Marguerite Darkow, Hunter College.

6. On the Maximum Partial Sums of Sequences of Independent Distributions.

Dr. K. L. Chung, Princeton University.

7. Some Results on the Distribution of Quadratic Forms from Gaussian Stochastic
Processes.

Mr. Herman Rubin, Cowles Commission.

The following four papers were presented by title:

8. A General Weak Limit Theorem for Independent Distributions.
Professor P. L. Hsu, University of North Carolina.
9. Some Significance Tests for the Median which are Valid under Very General
Conditions (Preliminary Report).

Mr. John E. Walsh, Princeton University.

10. Loss of Information in t-Tests with Unbalanced Samples (Preliminary Report).
Mr. John E. Walsh, Princeton University.

11. Some Theorems on the Bernoullian Multiplicative Process.
Mr. T. E. Harris, Princeton University.

Abstracts of all these papers appear elsewhere in this issue of the Annals. 

A beer party in honor of the foreign statisticians attending the meeting was 
held in the dining room of Saybrook College on Wednesday evening. A joint 
dinner with the American Mathematical Society and the Mathematical Associ- 
ation of America was held on Thursday evening. 

C. C. Craig, 

Acting Secretary. 